Call for help with usecode
Posted: Wed Mar 03, 2010 1:11 pm
by artaxerxes
Hi all,
As mentioned in
http://exult.info/forum/viewtopic.php?p=332697#p332697, I need some help setting something up for the translation.
Here is the issue: the books contain so much text that some functions *just* fit within 16-bit usecode (in particular the 0x02C1 function). Translated to French (which takes about 30% more space), the text overflows. That's why Exult has supported 32-bit usecode for a while, using the '.ext32' header.
The problem is that, to use this header, you have to rip, decompile, add the header, translate, recompile and glue the functions back together. It works, but it's a pain. It would be much better if a tool could convert the usecode to use phrase numbers instead of string offsets, so that the translation could be done almost directly on the usecode file (no need to blow it up), and if the tool also had a revert function (which recalculates the correct offsets), so that Exult could run the resulting usecode.
For instance, the code part of the 02C1 function starts with (from wud for clarity):
Code:
.code
.argc 0001H
.localc 0002H
.externsize 0000H
0000: 48 push eventid
0001: 1F 01 00 pushi 0001H ; 1
0004: 22 cmpeq
0005: 05 BA 06 jne 06C2
0008: 3E push itemref
0009: 1F 5E 00 pushi 005EH ; 94
000C: 39 A1 00 02 calli _play_sound_effect2@2 (00A1)
0010: 3E push itemref
0011: 39 6A 00 01 calli _book_mode@1 (006A)
0015: 3E push itemref
0016: 38 1C 00 01 callis _get_item_quality@1 (001C)
001A: 12 00 00 pop [0000]
001D: 21 00 00 push [0000]
0020: 1F 00 00 pushi 0000H ; 0
0023: 22 cmpeq
0024: 05 0F 00 jne 0036
0027: 1C 00 00 addsi32 L0000 ; ~~PHILIPHUS'S WIS...
002A: 33 say
002B: 1C 29 00 addsi32 L0029 ; ~~An enlightening...
002E: 33 say
002F: 1C 63 00 addsi32 L0063 ; ~Beginning with t...
0032: 33 say
0033: 06 8C 06 jmp 06C2
Ultimately, using that hypothetical tool, lines 0027, 002B and 002F would be replaced temporarily with:
0027: 1C 00 00 addsi32 L0000 // first translated phrase
...
002B: 1C 00 00 addsi32 L0001 // second translated phrase
...
002F: 1C 00 00 addsi32 L0002 // third translated phrase
Then using the revert function, all the offsets would be updated in-situ.
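A minimal sketch of that round trip, written in Python purely for illustration (it is not part of Exult or wud). It assumes a function's data segment is a run of NUL-terminated strings and that the code only ever points at the start of a string. The names `to_phrase_numbers` and `rebuild` are hypothetical, and patching the operands in the code segment is left out.

```python
def split_strings(data):
    """Map each string's byte offset in a data segment to its text.

    Assumption: strings are stored back to back, each NUL-terminated.
    """
    strings = {}
    offset = 0
    while offset < len(data):
        end = data.index(b"\x00", offset)  # raises ValueError if unterminated
        strings[offset] = data[offset:end]
        offset = end + 1
    return strings


def to_phrase_numbers(data):
    """Forward step: return (phrases, {old offset: phrase number})."""
    strings = split_strings(data)
    offsets = sorted(strings)
    phrases = [strings[o] for o in offsets]
    numbering = {o: n for n, o in enumerate(offsets)}
    return phrases, numbering


def rebuild(phrases):
    """Revert step: return (new data segment, [new offset per phrase number])."""
    out = bytearray()
    new_offsets = []
    for phrase in phrases:
        new_offsets.append(len(out))  # offset where this phrase will start
        out += phrase + b"\x00"
    return bytes(out), new_offsets
```

After translating the `phrases` list, `rebuild` gives the offsets that would have to be written back into the code. If any of them exceeds 0xFFFF, the function no longer fits in 16-bit usecode and would still need the 32-bit form.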
Would anyone feel like tackling this? I've tried, but I failed.
Artaxerxes
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 1:34 pm
by jkchakkal
That's right, my hero. Let's make the translation easy,
although this is the difficult part of the programming.
It would be great to put a translation file in an extra folder, so that the USECODE file does not have to be changed,
as with fonts.vga or gumps.vga in the PATCH folder.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 2:02 pm
by jkchakkal
Hey guys...
You may find help here:
http://wwwwolf.livejournal.com/tag/ultima
sorry for posting 2 times in a row
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 3:47 pm
by Malignant Manor
Is there a reference with all of the needed offsets for both games? As an alternative solution, it probably wouldn't be too hard for Exult to read from a txt file, as it does for various other things such as exultmsg.txt. Then you could have an example English file with all the text in place, and the translators could just translate the text from it. Is this not possible due to not all versions having an English usecode file?
This doesn't work so well for mods. The source should be available for them, so it isn't too hard. It would be much more time-consuming, though, as it means going through many files and filtering through code.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 4:01 pm
by artaxerxes
I see, so where Exult would see "say phrase at offset 0x23F3 for function 0x2D31" it would take a file in "translation/fr-fr/2D31.flx" and grab the text at index "23F3". That would work for me.
Artaxerxes
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 4:04 pm
by Malignant Manor
Is this not possible due to not all versions having an English usecode file?
By this I meant all versions that need translating.
Say you have the line
Hello World
and H is offset A and the point after d is offset A_2.
Exult would start reading the text file at listing A when it comes to the same offset while reading the usecode. It would start reading the usecode file again at offset A_2. All translation text would be in one file instead of the current 1237.
A:Hello World
B:Foo Bar
C: another example
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 4:26 pm
by Malignant Manor
I'm just not sure whether the offsets should be stored in the translation file or whether each line should simply be numbered. Having the offsets in the txt file could accommodate more language usecode files without needing to hard-code them into Exult.
I like this: it should work for any language's usecode as long as the offsets are correct, and it relies very little on Exult coding.
Code:
00000006_00000064:@The sails must be furled before the planks are raised.@
000000B5_000000FF:WARNING: $Temp1Flag is set!!! Report at what point in the game you are in.
as opposed to the following, which is easier to read but hard-coded and only suitable for language usecode files that have already been coded:
Code:
0x000:@The sails must be furled before the planks are raised.@
0x001:WARNING: $Temp1Flag is set!!! Report at what point in the game you are in.
The first seems much better from almost all viewpoints. It's also something that could probably be implemented for the new release. If it doesn't work properly or not all offsets are available, no harm done.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 4:44 pm
by artaxerxes
a couple of points:
1) usecode is made up of a collection of functions, and each function points to its own strings, as if it was namespaced. So there are pretty much as many functions with offset "0000" as there are functions where text data is involved. Following your idea, you would need to prepend the function number as a namespace, otherwise you will have many "0000" in there.
2) I like the idea of
Code:
00000006_00000064:@The sails must be furled before the planks are raised.@
000000B5_000000FF:WARNING: $Temp1Flag is set!!! Report at what point in the game you are in.
but why even bother with the part after the underscore? The first part is enough and since this solution doesn't alter the usecode, there is no need to make it 32bits.
Combining point 1) and 2), we could get:
Code:
0401:0000:@The sails must be furled before the planks are raised.@
0401:00B5:WARNING: $Temp1Flag is set!!! Report at what point in the game you are in.
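For illustration only, a line in this `function:offset:text` form could be split as in the Python sketch below. `parse_line` is a hypothetical helper, and reading both fields as hexadecimal is an assumption. Splitting on the first two colons only matters because the text itself may contain colons, as the WARNING line does.

```python
def parse_line(line):
    """Split 'FFFF:OOOO:text' into ((function, offset), text).

    Assumption: function number and offset are both written in hex.
    """
    function, offset, text = line.rstrip("\r\n").split(":", 2)
    return (int(function, 16), int(offset, 16)), text
```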
Artaxerxes
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 4:48 pm
by Malignant Manor
Unfortunately, I don't know how to code this, but it seems pretty simple. The real work would be writing the English example text files with all the proper offsets.
As for the translation file, maybe it could be read from a patch folder. It could have a name like translation_french.txt. The part before the underscore tells us that it is a translation file; the part afterwards could go into a cfg file variable.
If it is also readable from a data folder, the names could be translation_blackgate_french.txt and translation_serpentisle_french.txt. This is almost the same, except that the second part tells which game the file belongs to. It would likely need the same autodetection between FoV and SS that the Exult menu does, since most people will probably have the game blackgate really be forgeofvirtue (or whatever the default cfg name is).
Code:
[translation]
french
[/translation]
The variable would default to no and if the proper file isn't read, then it would just read the usecode normally.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 4:53 pm
by artaxerxes
So here is an idea:
through a configuration entry, get Exult to load a file, crafted by translators in the format described earlier, so that whenever an access to the "data" part of a usecode function is made, it uses that file instead of usecode.
The difficulties I anticipate are:
1) slow to load the text. After all, they used file offsets so that it's fast to load data. This solution would be a problem, unless we use offset value as index value in the flex file.
2) I don't know if this integrates well with current Exult designs, since I haven't followed much how patch dirs et al are accessed.
Artaxerxes
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 4:54 pm
by Malignant Manor
Code:
but why even bother with the part after the underscore? The first part is enough and since this solution doesn't alter the usecode, there is no need to make it 32bits.
I was just citing the hex offsets: the first is where to start reading the txt file, and the second is the point at which to resume reading the usecode once the text has been read.
With your code, how would Exult know where to start reading again, or would that not be necessary? I have no qualms about using output offsets like you are using, though.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 5:00 pm
by Malignant Manor
Also, scrap the data folder location option for translations since it would likely need patched fonts anyway.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 5:06 pm
by Malignant Manor
I'm unsure whether this solution would be affected by mods, since I don't know how they patch the usecode.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 5:16 pm
by artaxerxes
using patched fonts is easy. Exult already supports that. If there is a fonts.vga in the patch directory, Exult will use it.
But it's true that in this case you couldn't have, say, a French and a Russian version together. You would have to separate them.
So what if you put everything (fonts.vga, text.flx, translated text) into "/patch/translation/fr-fr/"? Or even forgo the "/patch" section altogether?
/translation/fr-fr/fonts.vga
/translation/fr-fr/text.flx
/translation/fr-fr/translation.flx
where translation would be a flex file in which each flex index is ("function number" << 16) + "offset value of phrase within function" (that is, the function number shifted left by four hex digits) and the flex value is the translated string.
So, the phrase at offset 0000 for function 0x0401 would be (really) coded as:
04010000: '@Dupre...@'
The flex index is a long int and Exult knows how to read flex files, so this solution should work. All that's needed is to instruct Exult to use that translation.flx file whenever grabbing data from "usecode".
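As a quick check of the index arithmetic: for function 0x0401 and offset 0000 to come out as 04010000, the function number has to be shifted by 16 bits (four hex digits). A Python sketch, with `flex_index` and `split_index` as hypothetical names:

```python
def flex_index(function, offset):
    """Pack a 16-bit function number and a 16-bit string offset into one index."""
    if not (0 <= function <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("function and offset must each fit in 16 bits")
    return (function << 16) | offset


def split_index(index):
    """Inverse of flex_index: recover (function, offset)."""
    return index >> 16, index & 0xFFFF
```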
Artaxerxes
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 5:22 pm
by jkchakkal
That's it. We are on track.
Perhaps the starting point would be to have Tisane create FLX files
from the files extracted from the USECODE.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 5:32 pm
by Malignant Manor
I was talking about patched usecode. With either example, the offsets may change with a mod's patched usecode, depending on how it is done. Exult would likely need to read the translation first and convert it, and then apply the normal patch.
I'm not sure how your flx idea works or how easy it would be to edit. It's probably something I would have to see to understand.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 5:34 pm
by artaxerxes
that's not an issue Rodrigo and I can take care of that.
The only problem I'm having right now is size. Creating a flx file with an index like '04010000' produces a file several hundred megabytes in size, even though it's mostly empty.
Artaxerxes
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 5:47 pm
by artaxerxes
The whole issue revolves around performance.
When Exult is executing usecode, it needs fast access to the data, and possibly even to pre-fetch and cache it.
That's why I mentioned flex files, since Exult can not only read them, but they use an index to make lookups fast.
But I just tested creating a flex file with an index of 0x04010000 and trust me, it wasn't pretty.
It's almost as if Exult should use SQLite or something to store and load data (and why not configuration entries while we're at it?!).
All in all, that seems like a bad idea. Maybe it would be better to explode the usecode so that each function is its own flex, with the index value being the offset value of the phrase being said.
So, 0x0401, line 0000 would be: 0401.flx, with a value of '@Dupre...@' at index 0000. Similarly, 0x02C1, line ED91 would be: 02C1.flx, with a value of '~ Some few candidates may [...]' at index ED91.
I tried an index of FFFF and the file stayed below 60k. Much better than the 577MB when using 04010000 as an index.
I understand your problem with mods, and the solution you offered might possibly work. However, even mods need to be translated anyway, so it would be better to give the user the option: mod or translation
Artaxerxes
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 5:52 pm
by jkchakkal
Very well then, Artaxerxes.
So, as I understand it, the idea is to strip the clutter out of the files and keep only what is needed.
Then getting Exult to understand that the FLX is connected with the USECODE is easy, correct?
577MB o.OOO
Meanwhile let's take a ride on Spark, (we're fans of the Ultima 7 chevrolet, lol)
http://www.pt.chevrolet-spark.eu/o-spar ... rente.html
joke!
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 5:56 pm
by Dominus
@Rodrigo, could you take your immediate translation questions to your thread? This thread is mainly for the devs (and people with code knowledge) on how to make translating easier.
I don't mean to be rude to you, but this way the thread keeps more on topic.
Thanks
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 6:07 pm
by jkchakkal
Right, sorry.
It's just that we are all busy and hot-headed, and since I cannot help with the code, I thought I would spread a little happiness.
I cannot sit still and do nothing while you strive to make things happen. I cannot help with code, but I can contribute ideas.
I will restrain myself, then.
I do not want this only to help my own translation, but so that everyone is able to translate.
You are able to make it happen.
Look how far Exult has come, and on how many platforms it runs. You should know that many people are hoping this tool succeeds.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 6:20 pm
by Malignant Manor
I tried an index of FFFF and the file stayed below 60k.
Barely.
so it would be better to give the user the option: mod or translation
The mods could be translated from source and combined with the appropriate translation. The mod source would have to be translated and compiled into a usecode file.
It would be nice to get Marzo's opinion, as he knows this stuff much better, having written most of the softcoding.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 7:04 pm
by marzo
@Artaxerxes: The problem with flex files is the format; here is a quick summary:
Code:
Header: 128 bytes
    Name: 80 bytes
    Magic number 1: 4 bytes
    # of entries: 4 bytes
    Magic number 2: 4 bytes
    Padding: 36 bytes (9 x 4 bytes)
Entry table:
    For each entry in the flex file (see # of entries), this comprises two numbers:
        Offset: 4 bytes
        Entry size: 4 bytes
Entries themselves. Initgame.dat has each of them prefixed by a 13-byte name
So for a flex file with an entry of index 0x04010000 we would have: 128 (header) + 8 bytes * 0x04010000 (entry table; all the offsets and sizes would be zero for the empty indexes) = 537,395,328 bytes. That is without the entry itself.
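The arithmetic can be checked with a few lines of Python. This simply reproduces the formula above (header plus 8 bytes per table slot below the given index); `min_flex_size` is a hypothetical name, not an Exult function.

```python
HEADER_SIZE = 128      # bytes, from the layout above
TABLE_ENTRY_SIZE = 8   # 4-byte offset + 4-byte size per entry


def min_flex_size(index):
    """Bytes taken by the header and the entry-table slots below `index`.

    The slot for `index` itself and the entry data are not counted,
    matching the calculation in the post.
    """
    return HEADER_SIZE + TABLE_ENTRY_SIZE * index


print(min_flex_size(0x04010000))  # 537395328
```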
Much better for this, in my opinion, would be to exploit the textmsg format Exult already uses for many data files (although in this case, more in line with the original intent). This would be:
Code:
%%section sectionname
index:entry
%%endsection
This would be a plain text file. It could be dumped from usecode by an easy-to-make tool (with a similar file for text.flx), put in a 'translations' dir (equivalent to the patch dir in some ways) and read into a std::map with a std::pair key and std::string values. The std::pair key would be (function #, offset), sorted by function # first, then by offset; this would make access very quick (std::map lookups are logarithmic, much like a binary search). Exult could check whether a translation language is set and, if so, load strings from this map instead.
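A rough sketch of the loading step, in Python rather than Exult's C++, so a plain dict keyed by a (function, offset) tuple stands in for the std::map. Several things here are assumptions and not the actual Exult format handling: that each section is named after a function number in hex, that each index is a hex data offset, and that lines starting with `#` are comments. `load_translations` is a hypothetical name.

```python
def load_translations(lines):
    """Read '%%section <function>' blocks into {(function, offset): text}."""
    table = {}
    function = None  # function number of the section being read, if any
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("%%section"):
            function = int(line.split()[1], 16)  # assumed: hex function number
        elif line.startswith("%%endsection"):
            function = None
        elif function is not None and ":" in line and not line.startswith("#"):
            index, text = line.split(":", 1)  # text may itself contain ':'
            table[(function, int(index, 16))] = text
    return table


def translated(table, function, offset, original):
    """Return the translated string, or the original if none is present."""
    return table.get((function, offset), original)
```

Falling back to the original string when a key is missing means an incomplete translation file would still leave the game playable.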
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 7:42 pm
by artaxerxes
Marzo, what would it take to get Exult to look for a translation file and to use this std::map as you mention?
Also, I looked for a file format description for textmsg and couldn't find one. Could you give an example of a file using that format?
thx
Artaxerxes
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 8:48 pm
by marzo
Marzo, what would it take to get Exult to look for a translation file and to use this std::map as you mention?
Not too much, I hope. I am looking into something like it now, in fact.
Also, I looked for a file format description for textmsg and couldn't find one. Could you give an example of a file using that format?
There are many examples: in the svn trunk, there is data/exultmsg.txt, data/exultmsg_de.txt, exultmsg_fr.txt; and all the txt files in data/bg and data/si. The idea would be: Exult looks into the textmsg file for a section whose name is the current function number and loads the strings; each function has one text line per string in the usecode file and each such line has an index equal to the data offset of the string in the usecode function's data segment.
Re: Call for help with usecode
Posted: Wed Mar 03, 2010 9:18 pm
by artaxerxes
Not too much, I hope. I am looking into something like it
now, in fact.
Awesome!
There are many examples: in the svn trunk, there is
data/exultmsg.txt, data/exultmsg_de.txt, exultmsg_fr.txt; and
all the txt files in data/bg and data/si.
That's why I couldn't see it. data/exultmsg.txt is not following the format you mention. Neither are the _de and _fr versions. The text under bg/ and si/ do however, so now I know what you mean.
The idea would be:
Exult looks into the textmsg file for a section whose name is
the current function number and loads the strings; each
function has one text line per string in the usecode file and
each such line has an index equal to the data offset of the
string in the usecode function's data segment.
That's perfect. I can easily create tools to interact between this file format and a web front-end.
I think we're on the right track!
Thanks Marzo, Malignant Manor, Dominus and Rodrigo for getting the ball rolling!
Artaxerxes
Re: Call for help with usecode
Posted: Thu Mar 04, 2010 12:24 am
by jkchakkal
Hey Hey hey.. Artaxerxes...
very good!
I anxiously await a new version of Tisane. Will it be updated?
I'll stay tuned for updates so I can continue with the project in Portuguese.
What can I expect?
Re: Call for help with usecode
Posted: Thu Mar 04, 2010 10:11 am
by The Ancient One
Here is the issue: the books contain so much text, that some functions *just* fits the 16 bit usecode (in particular the 0x02C1 function). Translated to French (which takes about 30% more space), it blows over. That's why Exult has supported 32 bit usecode for a while, using the '.ext32' header.
I'm no expert on this kind of technical problem, but I know that our tool was changed to support longer text and 32-bit usecode.
Anyway, we had this problem only in SI, and only with books.
Re: Call for help with usecode
Posted: Mon Mar 08, 2010 9:41 pm
by artaxerxes
I am making progress with Tisane. I've reworked the installer, so that it will create the (sqlite) DB for you, parse the usecode file found in the local directory to auto-populate the database, etc.
It's looking really good, with very few requirements, actually only one: sqlite through PDO (and that can be changed easily).
Anyways, I'm in the process of populating the db even more (currently only records function names but not yet their data content). Once done, I'll have to modify my pages for translation (I'm a more experienced developer than I was when I did those, so I have a few new things I could try) and I'll be ready to publish the code so that _ANYONE_ can translate U7 into their language even with no computer experience, through the web, as a team.
Marzo, paying all due respect to your time, did you have a chance to look at this textmsg support for translations? Can I help in any way?
thx
Artaxerxes
Re: Call for help with usecode
Posted: Mon Mar 08, 2010 9:50 pm
by Dominus
@Artaxerxes, we currently have some kind of feature freeze, so this whole "making translations easier" idea might go on the to-do-list-for-after.
I really think you should make a feature request in the tracker, pointing at this thread, so it doesn't fall off the radar once it's no longer on the first forum page.
Re: Call for help with usecode
Posted: Mon Mar 08, 2010 10:27 pm
by artaxerxes
@Dominus:
Will do.
@all:
In the meantime, I've successfully parsed the usecode (function name + data) and populated the DB, straight from PHP. It's fast: about 2.22s to do it all on my machine. Not much is left to do besides cleaning up the pages on which the translation happens. I also have to write the export utility.
Artaxerxes
Re: Call for help with usecode
Posted: Tue Mar 09, 2010 12:49 pm
by jkchakkal
Dominus, Artaxerxes, The Ancient One, Marzo...
Thanks for the feedback, it is really helpful.
I look forward to the new version of Tisane. I am not demanding anything; I just want to show that its success is eagerly awaited.
While I'm at it, I'd say it would be interesting to have a function to export to txt, if possible...
as an easy way to open the text on any computer, even without the Internet, in case that is of interest to someone.
Re: Call for help with usecode
Posted: Tue Mar 09, 2010 2:24 pm
by Dominus
Rodrigo, you are helpful as well; without your call for help, nothing would have changed.
Re: Call for help with usecode
Posted: Thu Mar 11, 2010 6:32 pm
by artaxerxes
Marzo,
I'm virtually done with Tisane. All I have to do is the help page and the export function.
So, if I understand correctly, the format would look like this (for SI):
Code:
%%section usecode
0096,0000:@The sails must be furled before the planks are raised.@
0096,0039:@I believe the gangplank is blocked.@
...
%%endsection
If that's the case, let me know so I can create this export from Tisane.
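To make the proposal concrete, here is a sketch of such an export. It is in Python for illustration only (Tisane itself is PHP), `export_usecode_section` is a hypothetical name, and the layout is just the one guessed at above, not a format confirmed by Marzo.

```python
def export_usecode_section(phrases):
    """Render {(function, offset): text} as a '%%section usecode' block.

    Keys are written as four-digit uppercase hex, sorted by function
    number first and then by offset.
    """
    lines = ["%%section usecode"]
    for (function, offset), text in sorted(phrases.items()):
        lines.append(f"{function:04X},{offset:04X}:{text}")
    lines.append("%%endsection")
    return "\n".join(lines) + "\n"
```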
thx
Artaxerxes