{"id":403,"date":"2017-07-15T22:02:14","date_gmt":"2017-07-15T21:02:14","guid":{"rendered":"https:\/\/www.pmd85.cz\/?page_id=403"},"modified":"2018-02-25T07:37:19","modified_gmt":"2018-02-25T06:37:19","slug":"akcelerace-procedury-kresleni-obrazku-na-pmd-85","status":"publish","type":"page","link":"https:\/\/www.pmd85.cz\/?page_id=403","title":{"rendered":"Speeding up the image-drawing routine on the PMD 85"},"content":{"rendered":"<p>As I gradually write Trailblazer and put its individual modules to use, the turn of the image-drawing routine came as well. So I thought I could publish an example here of how to go about speeding up a routine step by step, and the image-drawing routine is an obvious candidate.<\/p>\n<p>As the baseline against which I will measure the subsequent time savings, I chose a simple image-drawing algorithm. It is not deliberately bloated to make the savings look good. It is simply a literal translation of how, using two loops for the X and Y axes, I take the source graphics data and copy it into video RAM. Four successive evolutionary steps follow, in which I try to reduce the number of CPU cycles needed to draw a figure 18&#215;23 pixels in size, that is, 3 &#8220;video bytes&#8221; wide and 23 micro-rows high.<\/p>\n<p><a href=\"https:\/\/www.pmd85.cz\/wp-content\/uploads\/proc1.txt\" target=\"_blank\" rel=\"noopener noreferrer\">Baseline unoptimized drawing routine<\/a><\/p>\n<p>The unoptimized routine takes 3986 CPU cycles. 
The first saving that suggests itself is to drop the inner FOR\/NEXT loop over the data along the X axis and replace it with the original loop body written out three times. It has to be said that this is one of the two biggest savings achievable.<\/p>\n<p><a href=\"https:\/\/www.pmd85.cz\/wp-content\/uploads\/proc2.txt\" target=\"_blank\" rel=\"noopener noreferrer\">Optimization no. 1<\/a><\/p>\n<p>After this diet we are down to 2678 CPU cycles, which is 67% of the original time. But what now? We look at what is executed over and over without contributing to the required function: the PUSH PSW and POP PSW instructions at the start and end of the loop body. If we give the loop variable its own register, one that is not used for anything else, that might help. And indeed it does. It is not free, though. We have to bring the SP register into play, together with the overhead of saving its value before the routine and restoring it afterwards.<\/p>\n<p><a href=\"https:\/\/www.pmd85.cz\/wp-content\/uploads\/proc3.txt\" target=\"_blank\" rel=\"noopener noreferrer\">Optimization no. 2<\/a><\/p>\n<p>This optimization cut the time to 2252 CPU cycles, i.e. 56% of the time of the original unoptimized routine. What is now the slowest part of the drawing algorithm? The longest instruction, of course (in terms of cycle count), and that is DAD. Yes, it appears in the loop body only once, but even that can be put on a diet. How do we get rid of it? We cannot. 
It has to be there. But not so often, not at the end of every micro-row. If we rearrange the source image data appropriately, we can draw the image in this order:<\/p>\n<ul>\n<li>micro-row 0, left to right<\/li>\n<li>micro-row 4, right to left<\/li>\n<li>micro-row 8, left to right<\/li>\n<li>micro-row 12, right to left<\/li>\n<li>and so on<\/li>\n<\/ul>\n<p>The benefit is that computing the address of the next micro-row no longer needs a 10-cycle DAD instruction, only a 5-cycle INR H. Moving the video RAM address by 256 (which is what INR H does) is exactly a move 4 micro-rows down. Only once every several micro-rows do I use DAD to return to the top-left corner of the image, one micro-row lower than last time. After exactly four passes over the image, it is complete. The resulting solution is shown in Optimization no. 3 below:<\/p>\n<p><a href=\"https:\/\/www.pmd85.cz\/wp-content\/uploads\/proc4.txt\" target=\"_blank\" rel=\"noopener noreferrer\">Optimization no. 3<\/a><\/p>\n<p>These changes shortened the image-drawing routine to 1944 CPU cycles, which is 49% of the drawing time of the original routine. The growth in code size in bytes is considerable, however. The final optimization brings not just a slight gain but the second-largest reduction in the time needed to draw the image. How? 
For the data transfer we use not instructions that move a single byte (MOV, LDAX, STAX) but instructions that move two bytes at once. These belong to the stack instruction family (POP, PUSH) and have the advantage that the CPU fetches the opcode only once to transfer both bytes. Writing to video RAM this way is possible in general (with PUSH; the PMD 85-2 BIOS does so in its CLS routine), but in our case it would clash with the odd width of the image. So we will equivalently use POP to read the source image data. We already used the SP register in the previous attempt, so the extra overhead of handling it has already been paid.<\/p>\n<p><a href=\"https:\/\/www.pmd85.cz\/wp-content\/uploads\/proc5.txt\" target=\"_blank\" rel=\"noopener noreferrer\">Optimization no. 4<\/a><\/p>\n<p>We have ended up at 1457 CPU cycles, which is 37% of the time of the original unoptimized drawing routine, which needed 3986 cycles. So, a summary. In general, the following rules apply to optimizing code for speed.<\/p>\n<ul>\n<li>The overhead of FOR\/NEXT-style loops (the various PUSH, POP, DCR, JNZ instructions) must be negligible in time compared with the duration of the loop body itself. If writing out the loop body, say, 100 times would be unacceptable, I write it out, say, 4 times and let the loop repeat 25 times. 
That alone sometimes saves tens of percent (see the graphics routines in the PMD 85-2 BIOS).<\/li>\n<li>For variables that are used frequently (for example the loop variables mentioned above), use CPU registers that will not be needed for anything else.<\/li>\n<li>For the required function, choose the instructions with the lowest cycle count.<\/li>\n<li>Implement transfers of large volumes of data (typically graphics) with the POP\/PUSH\/XTHL instructions.<\/li>\n<\/ul>\n<p>In practice, all of the above is paid for with a longer program. It is a trade-off, and the choice is yours. But back to our example. The following image shows the ratio of drawing speeds graphically. On the left is a figure whose &#8220;movement&#8221; uses the original, slowest drawing routine; on the right is a figure using the fastest one. The figures start at the top and move downwards. 
The first figure to reach the bottom of the screen ends the race.<\/p>\n<p><a href=\"https:\/\/www.pmd85.cz\/wp-content\/uploads\/showbmp.gif\"><br \/>\n<\/a><a href=\"https:\/\/www.pmd85.cz\/wp-content\/uploads\/showbmp.gif\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" class=\"alignleft wp-image-418 size-thumbnail\" src=\"https:\/\/www.pmd85.cz\/wp-content\/uploads\/showbmp-150x150.gif\" alt=\"\" width=\"150\" height=\"150\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>If you would like to watch the race yourself, you can load the following program into the PMD 85 emulator by RM TEAM; it uses multitasking to run the five drawing routines described above.<\/p>\n<p><a href=\"https:\/\/www.pmd85.cz\/wp-content\/uploads\/showbmp.zip\" target=\"_blank\" rel=\"noopener noreferrer\">Virtual MGF tape file with the figure race<\/a><\/p>\n<p><a href=\"https:\/\/www.pmd85.cz\/wp-content\/uploads\/showbmp.txt\" target=\"_blank\" rel=\"noopener noreferrer\">Source code of the figure race<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>As I gradually write Trailblazer and put its individual modules to use, the turn of the image-drawing routine came as well. So I thought I could publish an example here of how to go about speeding up a routine step by step, and the image-drawing routine is an obvious candidate. 
As the baseline against which I will measure the subsequent time savings, I chose a simple [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":337,"menu_order":0,"comment_status":"open","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/www.pmd85.cz\/index.php?rest_route=\/wp\/v2\/pages\/403"}],"collection":[{"href":"https:\/\/www.pmd85.cz\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.pmd85.cz\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.pmd85.cz\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pmd85.cz\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=403"}],"version-history":[{"count":16,"href":"https:\/\/www.pmd85.cz\/index.php?rest_route=\/wp\/v2\/pages\/403\/revisions"}],"predecessor-version":[{"id":523,"href":"https:\/\/www.pmd85.cz\/index.php?rest_route=\/wp\/v2\/pages\/403\/revisions\/523"}],"up":[{"embeddable":true,"href":"https:\/\/www.pmd85.cz\/index.php?rest_route=\/wp\/v2\/pages\/337"}],"wp:attachment":[{"href":"https:\/\/www.pmd85.cz\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=403"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}