
Weird engine problem

Posted: Mon May 26, 2003 11:27 am
by artaxerxes
Hi,

The problem occurs when certain characters are used in a certain order: some of the text is simply not displayed!

I have translated Thoxa, so that she says:
"Xenka appelle notre ère le Temps du Déséquilibre".
The resulting text is the following:
"Xenka appelle notre re le Temps du Déséquilibre".

If I put:
"Xenka appelle notre époque-ère le Temps du Déséquilibre".
Then it shows ok.

If I put:
"Xenka appelle notre -ère le Temps du Déséquilibre".
It shows ok too.

But if I put:
"Xenka appelle notre ère le Temps du Déséquilibre".
It shows:
"Xenka appelle notre re le Temps du Déséquilibre".

Very strange indeed. Any clue on how to get it fixed?

Artaxerxes

Re: Weird engine problem

Posted: Mon May 26, 2003 12:13 pm
by artaxerxes
Just for kicks, I changed Dupre's conversation so that in the same phrase (when you ask him to leave right after starting a new game), he says:

"ère ière".

And as you might guess, he in fact says:
" re ière".

So it seems I cannot start a word with the letter "è".

Let me try with another word and I'll let you know.

Artaxerxes

Re: Weird engine problem

Posted: Mon May 26, 2003 12:17 pm
by artaxerxes
Indeed, a word cannot start with è in Exult. It can, however, start with other accented letters, like à and é.

Artaxerxes

Re: Weird engine problem

Posted: Mon May 26, 2003 12:29 pm
by wjp
What's the internal hex code for è?

Re: Weird engine problem

Posted: Tue May 27, 2003 3:15 am
by artaxerxes
è is 0xE8
é is 0xE9

Just to repeat: é shows at the beginning of a word, but è does not.

Artaxerxes

Re: Weird engine problem

Posted: Tue May 27, 2003 4:02 am
by artaxerxes
Oops,

my mistake, forget what I said! Those were the original codes; after conversion:
è becomes 0x0C
é becomes 0x03

Artaxerxes

Re: Weird engine problem

Posted: Tue May 27, 2003 4:42 am
by wjp
Sounds like we're considering è a space in some places and a word character in others.

Re: Weird engine problem

Posted: Tue May 27, 2003 4:52 am
by artaxerxes
What I'll do is make someone say each accented letter as the first (or only) character of a word, and I'll report which ones fail to show.

Artaxerxes

Re: Weird engine problem

Posted: Tue May 27, 2003 5:03 am
by artaxerxes
Codes 0x09 and 0x0A do not show when they are the first letter of a word.


Artaxerxes

Re: Weird engine problem

Posted: Tue May 27, 2003 5:41 am
by Darke
That's probably because 0x09 is an ASCII tab ('\t') and 0x0A is an ASCII line feed ('\n'), both technically considered whitespace (which is probably tested by isspace() or something).

0x03 isn't a printable character, so might be being filtered out by any isprint() tests.

Re: Weird engine problem

Posted: Tue May 27, 2003 7:07 am
by drcode
And 0x0c is a carriage-return. Perhaps we're treating that as whitespace too. I'd try to trace (in gdb) from the spot where the string's about to be printed.

Re: Weird engine problem

Posted: Tue May 27, 2003 7:28 am
by artaxerxes
There is something seriously wrong with me today.

The correct codes that do not show up when they are the first letter of a word are:

ë: 0x0B
è: 0x0C


My apologies for being totally messed up in the head!

Artaxerxes

Re: Weird engine problem

Posted: Tue May 27, 2003 7:44 am
by wjp
DrCode: 0x0D is a carriage return

Re: Weird engine problem

Posted: Tue May 27, 2003 1:21 pm
by Darke
Quick references. From man ascii:
Oct  Dec  Hex  Char
--------------------------
000  0    00   NUL '\0'
001  1    01   SOH
002  2    02   STX
003  3    03   ETX
004  4    04   EOT
005  5    05   ENQ
006  6    06   ACK
007  7    07   BEL '\a'
010  8    08   BS  '\b'
011  9    09   HT  '\t'
012  10   0A   LF  '\n'
013  11   0B   VT  '\v'
014  12   0C   FF  '\f'
015  13   0D   CR  '\r'
016  14   0E   SO
017  15   0F   SI

From man isspace():
isspace()
checks for white-space characters. In the "C" and "POSIX"
locales, these are: space, form-feed ('\f'), newline ('\n'),
carriage return ('\r'), horizontal tab ('\t'), and vertical tab
('\v').

From that, 0x0C ('\f'), 0x09 ('\t') and 0x0A ('\n') are all defined as 'spaces', so that's probably why they're being cut off.

Not that I've actually looked at the code, must rush off to work. *grin*
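
To make the isspace()/isprint() point concrete, here is a minimal sketch of how the standard C classification functions see these codes. It is not taken from the Exult source; `CharClass` and `classify` are hypothetical names, and it assumes the program runs in the default "C" locale.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical classification, for illustration only.
enum class CharClass { Space, Printable, Other };

// Classify a character code the way isspace()/isprint() would in the
// default "C" locale. Whitespace is checked first, so ' ' counts as Space.
CharClass classify(int code) {
    // isspace()/isprint() are only defined for values representable as
    // unsigned char (or EOF), so reject anything else up front.
    assert(code >= 0 && code <= 0xFF);
    if (std::isspace(code)) return CharClass::Space;
    if (std::isprint(code)) return CharClass::Printable;
    return CharClass::Other;
}
```

With this, `classify(0x0B)` and `classify(0x0C)` come out as `Space`, just like tab and line feed, while `classify(0x03)` is `Other`: neither whitespace nor printable.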

Re: Weird engine problem

Posted: Tue May 27, 2003 1:56 pm
by drcode
Erp. Right. 0x0c is a FF (formfeed), and 0x0b is a VT (vertical tab), according to this table I found. I wonder if forgetting ASCII codes is the first sign of senility:-)

Re: Weird engine problem

Posted: Tue May 27, 2003 6:11 pm
by fliptw
Nah, it's the first sign of Unicode dementia.

Re: Weird engine problem

Posted: Thu May 29, 2003 8:14 pm
by wjp
Which low-ASCII characters are used for French, exactly? It seems we'll have to write a custom isspace function.

Re: Weird engine problem

Posted: Fri May 30, 2003 3:50 am
by artaxerxes
Here are all the codes produced by my Perl program, which converts accented letters to Ultima-friendly codes.
Taken from CVS

Ç: 0x01
ü: 0x02
é: 0x03
â: 0x04
ä: 0x05 -- not used in French AFAIK
à: 0x06
ç: 0x07
ê: 0x08
ë: 0x0B
è: 0x0C
ï: 0x0E
î: 0x0F
È: 0x10
À: 0x11
É: 0x12
ô: 0x13
ö: 0x14 -- not used in French AFAIK
û: 0x15
ù: 0x16
Ô: 0x17
Ê: 0x18
Î: 0x19
ß: 0x1C -- not used in French AFAIK
í: 0x1D -- not used in French AFAIK
ó: 0x1E -- not used in French AFAIK
ú: 0x1F -- not used in French AFAIK

Artaxerxes
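
As a cross-check of the list above, the following sketch runs every remapped code through std::isspace() and collects the ones that collide with whitespace. The array simply restates the codes from the post; `kRemappedCodes` and `colliding_codes` are hypothetical names, and the default "C" locale is assumed.

```cpp
#include <cassert>
#include <cctype>
#include <vector>

// The 26 codes listed in the post, in the same order.
const int kRemappedCodes[] = {
    0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x0B, 0x0C, 0x0E, 0x0F,
    0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19,
    0x1C, 0x1D, 0x1E, 0x1F,
};

// Returns the remapped codes that std::isspace() reports as whitespace.
std::vector<int> colliding_codes() {
    std::vector<int> hits;
    for (int code : kRemappedCodes) {
        assert(code > 0 && code < 0x20);  // all remapped codes are control codes
        if (std::isspace(code)) hits.push_back(code);
    }
    return hits;
}
```

Under those assumptions the only collisions are 0x0B (ë) and 0x0C (è), which matches the two letters reported as disappearing at the start of a word.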

Re: Weird engine problem

Posted: Fri May 30, 2003 4:47 am
by wjp
Ok, good, they skip 0x09, 0x0A, 0x0D. That helps :-)
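
Because the remapped letters never use 0x09, 0x0A or 0x0D, a custom whitespace test can keep treating those three (plus the ordinary space) as whitespace and simply stop treating 0x0B and 0x0C that way. The following is only a sketch of that idea, not the fix that went into Exult; `is_text_space`, `skip_leading_space` and `check_examples` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical replacement for isspace(): only space, tab, line feed and
// carriage return count. 0x0B and 0x0C are left alone so they can be letters.
bool is_text_space(unsigned char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Hypothetical helper: drop leading whitespace the way a word-wrapping
// routine might, using the custom test instead of isspace().
std::string skip_leading_space(const std::string& text) {
    std::size_t i = 0;
    while (i < text.size() &&
           is_text_space(static_cast<unsigned char>(text[i]))) {
        ++i;
    }
    return text.substr(i);
}

// Usage check: a word starting with 0x0C (the remapped "è") or 0x0B (the
// remapped "ë") keeps its first letter.
void check_examples() {
    assert(skip_leading_space(" \x0Cre") == "\x0Cre");
    assert(skip_leading_space("\t\x0B") == "\x0B");
    assert(skip_leading_space("  word") == "word");
}
```

The design choice here is an explicit allow-list of whitespace codes rather than a locale-dependent call, so the result does not change with the C locale in effect.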