I fixed the loops that take pointer inputs in the memory variant.
When I first ran them through my test program, almost nothing worked... :)
Even though I am really only testing a single loop.
Code:
include(`../M4/FIRST.M4')dnl
ORG 0x8000
INIT(0x8000)
PRINT_I({"+10000 4321..54321 ", 0x0D})
VARIABLE(_counter,0)
VARIABLE(_index,0)
VARIABLE(from,4321)
VARIABLE(stop,54321)
VARIABLE(step,10000)
PUSH(54321) PUSH(4321) DO I CALL(_test) PUSH(10000) ADDLOOP CALL(_show)
PUSH(54321) __ASM PUSH(4321) DO I CALL(_test) PUSH(10000) ADDLOOP CALL(_show)
PUSH([stop]) PUSH(4321) DO I CALL(_test) PUSH(10000) ADDLOOP CALL(_show)
PUSH(54321) PUSH(4321) __ASM DO I CALL(_test) PUSH(10000) ADDLOOP CALL(_show)
PUSH([stop]) PUSH([from]) DO I CALL(_test) PUSH(10000) ADDLOOP CALL(_show)
PUSH(4321) __ASM PUSH(54321) SWAP DO I CALL(_test) PUSH(10000) ADDLOOP CALL(_show)
PUSH(54321) PUSH([from]) DO I CALL(_test) PUSH(10000) ADDLOOP CALL(_show)
CR
PUSH(54321) PUSH(4321) DO I CALL(_test) PUSH(10000) __ASM ADDLOOP CALL(_show)
PUSH(54321) __ASM PUSH(4321) DO I CALL(_test) PUSH(10000) __ASM ADDLOOP CALL(_show)
PUSH([stop]) PUSH(4321) DO I CALL(_test) PUSH(10000) __ASM ADDLOOP CALL(_show)
PUSH(54321) PUSH(4321) __ASM DO I CALL(_test) PUSH(10000) __ASM ADDLOOP CALL(_show)
PUSH([stop]) PUSH([from]) DO I CALL(_test) PUSH(10000) __ASM ADDLOOP CALL(_show)
PUSH(4321) __ASM PUSH(54321) SWAP DO I CALL(_test) PUSH(10000) __ASM ADDLOOP CALL(_show)
PUSH(54321) PUSH([from]) DO I CALL(_test) PUSH(10000) __ASM ADDLOOP CALL(_show)
CR
PUSH(54321) PUSH(4321) DO I CALL(_test) PUSH([step]) ADDLOOP CALL(_show)
PUSH(54321) __ASM PUSH(4321) DO I CALL(_test) PUSH([step]) ADDLOOP CALL(_show)
PUSH([stop]) PUSH(4321) DO I CALL(_test) PUSH([step]) ADDLOOP CALL(_show)
PUSH(54321) PUSH(4321) __ASM DO I CALL(_test) PUSH([step]) ADDLOOP CALL(_show)
PUSH([stop]) PUSH([from]) DO I CALL(_test) PUSH([step]) ADDLOOP CALL(_show)
PUSH(4321) __ASM PUSH(54321) SWAP DO I CALL(_test) PUSH([step]) ADDLOOP CALL(_show)
PUSH(54321) PUSH([from]) DO I CALL(_test) PUSH([step]) ADDLOOP CALL(_show)
STOP
COLON(_test)
PUSH(_counter) FETCH _1ADD PUSH(_counter) STORE
PUSH(_index) STORE
SEMICOLON
COLON(_show)
PUSH(_index) FETCH DOT
PUSH(_counter) FETCH SPACE UDOT
PUSH(0) PUSH(_index) STORE
PUSH(0) PUSH(_counter) STORE
CR
SEMICOLON
When I looked at the original code, it did not even make sense to me.
Instead of checking what the carry looks like for the different signs of "step" in:
"index-stop" + "step"
I was working with "stop" - "index+step". It probably makes sense for the basic cases, but it certainly does not work every time.
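The carry rule for "index-stop" + "step" can be modelled in a few lines of Python. This is only a sketch under stated assumptions: 16-bit wraparound arithmetic, a nonzero step passed as a signed Python integer, and the names `keeps_looping` and `run_loop` are made up for illustration. It is not the code the macros generate.

```python
MASK = 0xFFFF  # 16-bit registers, as on the Z80


def keeps_looping(index, stop, step):
    """Carry test on (index - stop) + step; step must be nonzero.

    Returns True when the loop should run again.
    """
    diff = (index - stop) & MASK           # index-stop as an unsigned 16-bit value
    carry = diff + (step & MASK) > MASK    # carry out of the 16-bit addition
    # Positive step: carry means the sum crossed zero, so the loop ends.
    # Negative step: carry means the sum stayed at or above zero, so it continues.
    return carry if step < 0 else not carry


def run_loop(start, stop, step):
    """Return the index values seen by the body of DO ... +LOOP."""
    seen = []
    index = start & MASK
    while True:
        seen.append(index)                 # the body always runs at least once
        again = keeps_looping(index, stop, step)
        index = (index + step) & MASK
        if not again:
            return seen


print(run_loop(4321, 54321, 10000))  # [4321, 14321, 24321, 34321, 44321]
```

For the range used in the test program this model runs the body five times, with 44321 as the last index. A negative step flips the meaning of the carry, which is why the sign of "step" has to be known when the jump is generated.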
So, for example, when STEP and START are known:
Code:
dworkin@dw-A15:~/Programovani/ZX/M4_FORTH-Version-2023-7-23-$ ./check_word.sh 'PUSH(pocatek) DO PUSH(1000) ADDLOOP'
dec HL ; 1:6 pocatek do_101(m)
ld A, L ; 1:4 pocatek do_101(m)
ld (stp_lo101), A ; 3:13 pocatek do_101(m) lo stop-1
ld A, H ; 1:4 pocatek do_101(m)
ld (stp_hi101), A ; 3:13 pocatek do_101(m) hi stop-1
ld HL, pocatek ; 3:10 pocatek do_101(m)
ld (idx101), HL ; 3:16 pocatek do_101(m)
pop HL ; 1:10 pocatek do_101(m)
ex DE, HL ; 1:4 pocatek do_101(m)
do101: ; pocatek do_101(m)
push HL ; 1:11 1000 +loop_101(m)
idx101 EQU $+1 ; 1000 +loop_101(m)
ld HL, 0x0000 ; 3:10 1000 +loop_101(m)
ld BC, 1000 ; 3:10 1000 +loop_101(m) BC = step
add HL, BC ; 1:11 1000 +loop_101(m) HL = index+step
ld (idx101), HL ; 3:16 1000 +loop_101(m) save index
stp_lo101 EQU $+1 ; 1000 +loop_101(m)
ld A, 0xFF ; 2:7 1000 +loop_101(m) lo stop-1
sub L ; 1:4 1000 +loop_101(m)
ld L, A ; 1:4 1000 +loop_101(m)
stp_hi101 EQU $+1 ; 1000 +loop_101(m)
ld A, 0xFF ; 2:7 1000 +loop_101(m) hi stop-1
sbc A, H ; 1:4 1000 +loop_101(m)
ld H, A ; 1:4 1000 +loop_101(m) HL = stop-(index+step)
add HL, BC ; 1:11 1000 +loop_101(m) HL = stop-index
xor H ; 1:4 1000 +loop_101(m)
pop HL ; 1:10 1000 +loop_101(m)
jp p, do101 ; 3:10 1000 +loop_101(m)
leave101: ; 1000 +loop_101(m)
exit101: ; 1000 +loop_101(m)
; seconds: 1 ;[42:203]
In DO I painstakingly undid the splitting of "stop" into 2 bytes for these cases, so that the code came out something like this:
Code:
dworkin@dw-A15:~/Stažené/M4_FORTH-master$ ./check_word.sh 'PUSH(pocatek) DO PUSH(1000) ADDLOOP'
ld [stp101], HL ; 3:16 pocatek do_101(m) ( stop pocatek -- )
ld HL, pocatek ; 3:10 pocatek do_101(m) HL = index
ld [idx101], HL ; 3:16 pocatek do_101(m) save index
ex DE, HL ; 1:4 pocatek do_101(m)
pop DE ; 1:10 pocatek do_101(m)
do101: ; pocatek do_101(m)
;[25:143] 1000 +loop_101(m) version stop from stack
push DE ; 1:11 1000 +loop_101(m)
push HL ; 1:11 1000 +loop_101(m)
idx101 EQU $+1 ; 1000 +loop_101(m)
ld HL, 0x0000 ; 3:10 1000 +loop_101(m) HL = index
stp101 EQU $+1 ; 1000 +loop_101(m)
ld BC, 0x0000 ; 3:10 1000 +loop_101(m) BC = stop
ld DE, 0x03E8 ; 3:10 1000 +loop_101(m) DE = step
or A ; 1:4 1000 +loop_101(m)
sbc HL, BC ; 2:15 1000 +loop_101(m) HL = index-stop
add HL, DE ; 1:11 1000 +loop_101(m) HL = index-stop+step
sbc A, A ; 1:4 1000 +loop_101(m)
add HL, BC ; 1:11 1000 +loop_101(m) HL = index+step
ld [idx101], HL ; 3:16 1000 +loop_101(m) save index
pop HL ; 1:10 1000 +loop_101(m)
pop DE ; 1:10 1000 +loop_101(m)
jp p, do101 ; 3:10 1000 +loop_101(m) positive step
; seconds: 1 ;[36:199]
At this point, though, I tried what happens when, in the memory variant, I do not need the resulting "index+step" value left in some register at the end, and can compute it first instead:
index+step
store the new index
"index+step" - "stop"
and only now check the carry of the operation
"index+step-stop" - "step"
If it works the other way round for "index-stop" + "step", where we check whether the result crossed zero, then subtracting the step back from the result must work as well.
Only the carry value is inverted, because instead of subtracting "step" I add the negated "step", which is more efficient.
The previous computation needed 3 free 16-bit registers: one for the index and intermediate results, one for stop, and one for step, which is used twice. The "stop" register can be dropped and handled 8 bits at a time through the accumulator, but that is longer and, I think, at most one T-state faster, if at all. It also complicates the DO word.
Here, surprisingly, it comes out faster to load "step" into BC a second time, negated. Another advantage is that once the carry has been produced I no longer have to compute anything, so nothing clobbers the flags.
Code:
dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'PUSH(pocatek) DO PUSH(step) ADDLOOP'
ld [stp101], HL ; 3:16 pocatek do_101(m) ( stop pocatek -- )
ld HL, pocatek ; 3:10 pocatek do_101(m) HL = index
ld [idx101], HL ; 3:16 pocatek do_101(m) save index
ex DE, HL ; 1:4 pocatek do_101(m)
pop DE ; 1:10 pocatek do_101(m)
do101: ; pocatek do_101(m)
;[25:128] step +loop_101(m) version stop from stack
push HL ; 1:11 step +loop_101(m)
idx101 EQU $+1 ; step +loop_101(m)
ld HL, 0x0000 ; 3:10 step +loop_101(m) HL = index
ld BC, step ; 3:10 step +loop_101(m) BC = step
add HL, BC ; 1:11 step +loop_101(m) HL = index+step
ld [idx101], HL ; 3:16 step +loop_101(m) save index
stp101 EQU $+1 ; step +loop_101(m)
ld BC, 0x0000 ; 3:10 step +loop_101(m) BC = stop
or A ; 1:4 step +loop_101(m)
sbc HL, BC ; 2:15 step +loop_101(m) HL = index+step-stop
ld BC, 0-step ; 3:10 step +loop_101(m)
add HL, BC ; 1:11 step +loop_101(m) HL = index-stop
pop HL ; 1:10 step +loop_101(m)
if ((0x8000 & (step)) = 0)
jp c, do101 ; 3:10 step +loop_101(m) positive step
else
jp nc, do101 ; 3:10 step +loop_101(m) negative step
endif
leave101: ; step +loop_101(m)
exit101: ; step +loop_101(m)
; seconds: 1 ;[39:194]
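To convince myself that the reordered computation gives the same decision with the carry inverted, here is a small Python comparison. It is a sketch with assumptions: 16-bit wraparound, a nonzero step given as a signed Python integer, and the helper names `add16`, `old_order` and `new_order` are invented for this example.

```python
MASK = 0xFFFF  # 16-bit wraparound


def add16(a, b):
    """16-bit addition: return (result, carry)."""
    total = (a & MASK) + (b & MASK)
    return total & MASK, total > MASK


def old_order(index, stop, step):
    """Test the carry of (index - stop) + step. Returns (new_index, loop_again)."""
    diff = (index - stop) & MASK
    _, carry = add16(diff, step)
    new_index = (index + step) & MASK
    return new_index, (carry if step < 0 else not carry)


def new_order(index, stop, step):
    """Compute index + step first, then test the carry of (index + step - stop) + (-step)."""
    new_index, _ = add16(index, step)      # can be stored right away
    diff = (new_index - stop) & MASK       # index+step-stop
    _, carry = add16(diff, -step)          # add the negated step
    # The carry is inverted compared with old_order.
    return new_index, (carry if step > 0 else not carry)


# Compare both orders on a few sample values.
samples = [(4321, 54321, 10000), (44321, 54321, 10000),
           (2, 0, -2), (0, 0, -2), (65535, 0, 1)]
for index, stop, step in samples:
    assert old_order(index, stop, step) == new_order(index, stop, step)
```

In this model a positive step continues on carry and a negative step continues on no carry, which matches the `jp c` / `jp nc` pair in the listing above. Step 0 is left out on purpose, because the two orders do not agree there.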
+LOOP is the same size, but the code is faster.
The trick is that in this case I can do the first computation, "index+step", through the __ADD_R16_CONST macro, which can optimize it further when the value is a convenient one, for example "step" = -2:
Code:
dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'PUSH([konec]) DO PUSH(-2) ADDLOOP'
ld [stp101], HL ; 3:16 [konec] do_101(m) ( stop [konec] -- )
ld HL,[konec] ; 3:16 [konec] do_101(m) HL = index
ld [idx101], HL ; 3:16 [konec] do_101(m) save index
ex DE, HL ; 1:4 [konec] do_101(m)
pop DE ; 1:10 [konec] do_101(m)
do101: ; [konec] do_101(m)
;[23:119] -2 +loop_101(m) version stop from stack
push HL ; 1:11 -2 +loop_101(m)
idx101 EQU $+1 ; -2 +loop_101(m)
ld HL, 0x0000 ; 3:10 -2 +loop_101(m) HL = index
dec HL ; 1:6 -2 +loop_101(m)
dec HL ; 1:6 -2 +loop_101(m) HL = index+step
ld [idx101], HL ; 3:16 -2 +loop_101(m) save index
stp101 EQU $+1 ; -2 +loop_101(m)
ld BC, 0x0000 ; 3:10 -2 +loop_101(m) BC = stop
or A ; 1:4 -2 +loop_101(m)
sbc HL, BC ; 2:15 -2 +loop_101(m) HL = index+step-stop
ld BC, 0x0002 ; 3:10 -2 +loop_101(m)
add HL, BC ; 1:11 -2 +loop_101(m) HL = index-stop
pop HL ; 1:10 -2 +loop_101(m)
jp nc, do101 ; 3:10 -2 +loop_101(m) negative step
leave101: ; -2 +loop_101(m)
exit101: ; -2 +loop_101(m)
; seconds: 1 ;[34:181]
So it replaces the addition with 2x "dec HL".
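The kind of choice involved can be sketched in Python as below. This is a hypothetical model, not the real __ADD_R16_CONST macro: the function name `add_const_to_hl`, the threshold of 3 and the set of special cases are assumptions. Only the byte and T-state counts per instruction are the usual Z80 figures, the same ones shown in the listings.

```python
def add_const_to_hl(value):
    """Return (instructions, bytes, t_states) for HL += value, modulo 65536.

    Hypothetical selection rules, for illustration only.
    """
    n = value & 0xFFFF
    signed = n - 0x10000 if n & 0x8000 else n
    if signed == 0:
        return [], 0, 0
    if abs(signed) <= 3:
        # inc/dec HL: 1 byte, 6 T-states each, so up to 3 of them
        # still beat "ld BC,nn" + "add HL,BC" (4 bytes, 21 T-states)
        op = "inc HL" if signed > 0 else "dec HL"
        return [op] * abs(signed), abs(signed), 6 * abs(signed)
    if n == 0x0100:
        return ["inc H"], 1, 4             # +256 only touches the high byte
    if n == 0xFF00:
        return ["dec H"], 1, 4             # -256 only touches the high byte
    return [f"ld BC, 0x{n:04X}", "add HL, BC"], 4, 21


print(add_const_to_hl(-2))    # (['dec HL', 'dec HL'], 2, 12)
print(add_const_to_hl(1000))  # (['ld BC, 0x03E8', 'add HL, BC'], 4, 21)
```

The short forms do not produce the 16-bit carry that "add HL, BC" does, so they only fit the steps where the carry is not needed, such as the first "index+step".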
If "stop" is taken through a pointer instead of from the stack, it looks similar. It changes more when it is a number or a variable.
Originally:
Code:
dworkin@dw-A15:~/Programovani/ZX/M4_FORTH-Version-2023-7-23-$ ./check_word.sh 'PUSH(konec) SWAP DO PUSH(1000) ADDLOOP'
ld (idx101), HL ; 3:16 konec swap do_101(m) ( konec index -- )
pop HL ; 1:10 konec swap do_101(m)
ex DE, HL ; 1:4 konec swap do_101(m)
do101: ; konec swap do_101(m)
push HL ; 1:11 1000 +loop_101(m)
idx101 EQU $+1 ; 1000 +loop_101(m)
ld HL, 0x0000 ; 3:10 1000 +loop_101(m)
ld BC, 1000 ; 3:10 1000 +loop_101(m) BC = step
add HL, BC ; 1:11 1000 +loop_101(m) HL = index+step
ld (idx101), HL ; 3:16 1000 +loop_101(m) save index
stp_lo101 EQU $+1 ; 1000 +loop_101(m)
ld A, low +(konec)-1; 2:7 1000 +loop_101(m) lo stop-1
sub L ; 1:4 1000 +loop_101(m)
ld L, A ; 1:4 1000 +loop_101(m)
stp_hi101 EQU $+1 ; 1000 +loop_101(m)
ld A, high +(konec)-1; 2:7 1000 +loop_101(m) hi stop-1
sbc A, H ; 1:4 1000 +loop_101(m)
ld H, A ; 1:4 1000 +loop_101(m) HL = stop-(index+step)
add HL, BC ; 1:11 1000 +loop_101(m) HL = stop-index
xor H ; 1:4 1000 +loop_101(m)
pop HL ; 1:10 1000 +loop_101(m)
jp p, do101 ; 3:10 1000 +loop_101(m)
leave101: ; 1000 +loop_101(m)
exit101: ; 1000 +loop_101(m)
; seconds: 1 ;[30:153]
New:
Code:
dworkin@dw-A15:~/Stažené/M4_FORTH-master$ ./check_word.sh 'PUSH(konec) SWAP DO PUSH(1000) ADDLOOP'
ld [idx101], HL ; 3:16 konec swap do_101(m) ( konec index -- )
ex DE, HL ; 1:4 konec swap do_101(m)
pop DE ; 1:10 konec swap do_101(m)
do101: ; konec swap do_101(m)
;[25:143] 1000 +loop_101(m) version default
push DE ; 1:11 1000 +loop_101(m)
push HL ; 1:11 1000 +loop_101(m)
idx101 EQU $+1 ; 1000 +loop_101(m)
ld HL, 0x0000 ; 3:10 1000 +loop_101(m) HL = index
ld DE, 0x03E8 ; 3:10 1000 +loop_101(m) DE = step
ld BC, konec ; 3:10 1000 +loop_101(m) BC = stop
or A ; 1:4 1000 +loop_101(m)
sbc HL, BC ; 2:15 1000 +loop_101(m) HL = index-stop
add HL, DE ; 1:11 1000 +loop_101(m) HL = index-stop+step
sbc A, A ; 1:4 1000 +loop_101(m) carry to sign
add HL, BC ; 1:11 1000 +loop_101(m) HL = index+step
ld [idx101], HL ; 3:16 1000 +loop_101(m) save index
pop HL ; 1:10 1000 +loop_101(m)
pop DE ; 1:10 1000 +loop_101(m)
jp p, do101 ; 3:10 1000 +loop_101(m) positive step
; seconds: 1 ;[30:173]
After the improvement:
Code:
dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'PUSH(konec) SWAP DO PUSH(1000) ADDLOOP'
ld [idx101], HL ; 3:16 konec swap do_101(m) ( konec index -- )
ex DE, HL ; 1:4 konec swap do_101(m)
pop DE ; 1:10 konec swap do_101(m)
do101: ; konec swap do_101(m)
;[23:120] 1000 +loop_101(m) version default
push HL ; 1:11 1000 +loop_101(m)
idx101 EQU $+1 ; 1000 +loop_101(m)
ld HL, 0x0000 ; 3:10 1000 +loop_101(m) HL = index
ld BC, 0x03E8 ; 3:10 1000 +loop_101(m) BC = step
add HL, BC ; 1:11 1000 +loop_101(m) HL = index+step
ld [idx101], HL ; 3:16 1000 +loop_101(m) save index
ld BC, 0-konec ; 3:10 1000 +loop_101(m) BC = -stop
add HL, BC ; 1:11 1000 +loop_101(m) HL = index+step-stop
ld BC, 0xFC18 ; 3:10 1000 +loop_101(m) BC = -step
add HL, BC ; 1:11 1000 +loop_101(m) HL = index-stop
pop HL ; 1:10 1000 +loop_101(m)
jp c, do101 ; 3:10 1000 +loop_101(m) positive step
leave101: ; 1000 +loop_101(m)
exit101: ; 1000 +loop_101(m)
; seconds: 1 ;[28:150]
On top of that, the subtraction of "stop" can be optimized too when the values are suitable.
Code:
dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'PUSH(256) SWAP DO PUSH(-2) ADDLOOP'
ld [idx101], HL ; 3:16 256 swap do_101(m) ( 256 index -- )
ex DE, HL ; 1:4 256 swap do_101(m)
pop DE ; 1:10 256 swap do_101(m)
do101: ; 256 swap do_101(m)
;[18:94] -2 +loop_101(m) version default
push HL ; 1:11 -2 +loop_101(m)
idx101 EQU $+1 ; -2 +loop_101(m)
ld HL, 0x0000 ; 3:10 -2 +loop_101(m) HL = index
dec HL ; 1:6 -2 +loop_101(m)
dec HL ; 1:6 -2 +loop_101(m) HL = index+step
ld [idx101], HL ; 3:16 -2 +loop_101(m) save index
dec H ; 1:4 -2 +loop_101(m) HL = index+step-stop
ld BC, 0x0002 ; 3:10 -2 +loop_101(m) BC = -step
add HL, BC ; 1:11 -2 +loop_101(m) HL = index-stop
pop HL ; 1:10 -2 +loop_101(m)
jp nc, do101 ; 3:10 -2 +loop_101(m) negative step
leave101: ; -2 +loop_101(m)
exit101: ; -2 +loop_101(m)
; seconds: 1 ;[23:124]
etc.
PS: The main thing is that it passes the new tests.