ANSI -> UT8 Conversion

Utilities like DBU, Make, IDE written in HMG/ used to create HMG based applications

Moderator: Rathinagiri

esgici
Posts: 4543
Joined: Wed Jul 30, 2008 9:17 pm
DBs Used: DBF
Location: iskenderun / Turkiye

ANSI -> UT8 Conversion

Post by esgici »

Hi All

After OEM->ANSI, now we are in another migration ANSI -> Unicode.

We have string-based conversion functions but, as far as I know, no file-based one.

I guess many people can't start that migration because converting program source files manually is difficult.

I hope this small program (whether or not it counts as a utility) will be useful to our friends.

As always, please don't forgive my faults; any suggestions, corrections and bug reports are welcome :arrow:
Screenshot of CFAN2UT8 program
CFANS2UT8.jpg (9.3 KiB) Viewed 6504 times
CFANS2UT8(src).zip
Source files for CFANS2UT8 program
(3.01 KiB) Downloaded 463 times
CFANS2UT8(exe).zip
Executable for CFANS2UT8 program
(1.17 MiB) Downloaded 433 times
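
For readers who want to see what a file-based conversion involves, here is a minimal Python sketch. It is not the CFANS2UT8 source (that is in the zip above). It assumes that "ANSI" means Windows code page 1252; the names `ansi_to_utf8_bytes`, `convert_file` and the `codepage` parameter are illustrative only, and Turkish or Central European sources would need `cp1254` or `cp1250` instead.

```python
import codecs


def ansi_to_utf8_bytes(data: bytes, codepage: str = "cp1252") -> bytes:
    """Re-encode ANSI bytes as UTF-8 with a leading BOM (EF BB BF).

    Assumption: the input really is in `codepage`. Bytes that are
    undefined in that code page raise UnicodeDecodeError.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data  # already UTF-8 with BOM: leave untouched
    text = data.decode(codepage)
    return codecs.BOM_UTF8 + text.encode("utf-8")


def convert_file(src_path: str, dst_path: str, codepage: str = "cp1252") -> None:
    """Read src_path as raw bytes and write the UTF-8 version to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(ansi_to_utf8_bytes(data, codepage))
```

Working on raw bytes rather than in text mode keeps the original line endings exactly as they were, so the only size changes come from the BOM and from non-ASCII characters.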
Viva HMG :D
Viva INTERNATIONAL HMG :D
Javier Tovar
Posts: 1275
Joined: Tue Sep 03, 2013 4:22 am
Location: Tecámac, México

Re: ANSI -> UT8 Conversion

Post by Javier Tovar »

Thanks for sharing, Mr. Esgici.
Regards
serge_girard
Posts: 3165
Joined: Sun Nov 25, 2012 2:44 pm
DBs Used: 1 MySQL - MariaDB
2 DBF
Location: Belgium

Re: ANSI -> UT8 Conversion

Post by serge_girard »

Thanks Esgici !

I'm busy converting 12 PRG files (242 KB in total) to UTF-8.

I have a few suggestions:

1) Please make a backup before conversion.
2) Show progress over files as well as over lines, instead of only over the lines of each file.

Greetings and I will let you know the results!

Thanks, Serge
There's nothing you can do that can't be done...

Re: ANSI -> UT8 Conversion

Post by serge_girard »

Esgici,


Conversion went well (a bit slow).

I did a file compare: all converted (= new) files have 3 extra bytes at the very beginning, ef bb bf (= BOM),
and an extra 0d 0a (= CRLF) at the end.
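
A check like this can be done without a hex viewer. The following Python sketch (the function name `describe_edges` is made up for illustration) reports whether a file's bytes start with the UTF-8 BOM and how they end. A trailing Ctrl-Z (0x1A) is checked as well, since it is a common end-of-file marker left by DOS-era tools.

```python
import codecs


def describe_edges(data: bytes) -> dict:
    """Report whether data starts with a UTF-8 BOM and how it ends."""
    if data.endswith(b"\r\n"):
        ending = "CRLF"
    elif data.endswith(b"\x1a"):
        ending = "EOF (0x1A)"
    elif data.endswith(b"\n"):
        ending = "LF"
    else:
        ending = "none"
    return {"bom": data.startswith(codecs.BOM_UTF8), "ending": ending}


def describe_file(path: str) -> dict:
    """Convenience wrapper: read a file as raw bytes and describe it."""
    with open(path, "rb") as handle:
        return describe_edges(handle.read())
```

Running `describe_file` on the original and on the converted copy of the same source shows at a glance which of the two differences (BOM, trailing bytes) applies to that file.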


20131212 09:22:18

Folder : P:\hmg.3.2\KEMP

ANALYSE.TXT : 101 lines,  1,757 bytes converted to UT8 format in 1,762  bytes.
BH_DOC.Prg : 534 lines,  12,841 bytes converted to UT8 format in 12,846  bytes.
BH_DOWNLOADS.Prg : 302 lines,  7,357 bytes converted to UT8 format in 7,360  bytes.
BH_EXE.Prg : 415 lines,  10,444 bytes converted to UT8 format in 10,449  bytes.
BH_PROG_AUTH.Prg : 677 lines,  18,935 bytes converted to UT8 format in 18,940  bytes.
bh_proj.Prg : 1,564 lines,  42,694 bytes converted to UT8 format in 42,699  bytes.
BH_TEXT.Prg : 571 lines,  13,446 bytes converted to UT8 format in 13,449  bytes.
BH_USERS.Prg : 1,193 lines,  36,291 bytes converted to UT8 format in 36,346  bytes.
HALLOCKS.PRG : 526 lines,  10,236 bytes converted to UT8 format in 10,239  bytes.
INIT.PRG : 632 lines,  16,252 bytes converted to UT8 format in 16,255  bytes.
KEMP.PRG : 2,486 lines,  65,370 bytes converted to UT8 format in 65,378  bytes.
KEMP_HELP.PRG : 101 lines,  1,827 bytes converted to UT8 format in 1,832  bytes.
KEMP_SETUP.prg : 540 lines,  12,338 bytes converted to UT8 format in 12,341  bytes.

20131212 10:09:13
So all looks very good. Later I will try compilation in HMG3.2; I will let you know.

Greetings, Serge

Re: ANSI -> UT8 Conversion

Post by esgici »

Thanks for your interest :)

Serge:

Sadly, for now I don't have enough time to deal with extra features :(

For possible future work, please give me a road map for backing up. What kind of backup would be best:
  • rename original files
  • move original files to a separate folder
  • compress original files
  • ... etc
And please think about naming issues when the process is repeated.
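
As one possible answer to the naming question, here is a Python sketch of the "move to a separate folder" idea, done as a copy so the originals stay in place. Each run gets its own timestamped subfolder, so repeating the process does not overwrite earlier backups. The names `backup_file`, `new_stamp` and the folder layout are assumptions for illustration, not part of the program.

```python
import shutil
import time
from pathlib import Path


def new_stamp() -> str:
    """Return a run identifier such as 20131212_092218."""
    return time.strftime("%Y%m%d_%H%M%S")


def backup_file(src: Path, backup_root: Path, stamp: str) -> Path:
    """Copy src into backup_root/stamp/ and return the path of the copy."""
    target_dir = backup_root / stamp
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / src.name
    shutil.copy2(src, target)  # copy2 also preserves timestamps
    return target
```

The stamp would be created once per run and passed to every `backup_file` call, so all files converted together end up in the same subfolder.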

If the only difference between the two formats is the BOM, this means the ANSI file doesn't include any foreign (non-English) characters.
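
The arithmetic behind this can be sketched in Python. ASCII characters take one byte in both formats, so the growth is 3 bytes for the BOM plus the extra bytes needed by each non-ASCII character. The function name `utf8_growth` is hypothetical, and code page 1252 is an assumption about what "ANSI" means here.

```python
def utf8_growth(data: bytes, codepage: str = "cp1252", bom: bool = True) -> int:
    """Return how many bytes larger the UTF-8 version of data would be."""
    text = data.decode(codepage)
    extra = len(text.encode("utf-8")) - len(data)  # 0 for pure ASCII
    return extra + (3 if bom else 0)
```

In Serge's log most files grow by 3 or 5 bytes (BOM alone, or BOM plus the extra CRLF), while BH_USERS.Prg grows by 55 bytes (36,291 to 36,346), which would be consistent with that file containing some non-ASCII characters.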

The cause of the extra CRLF at the end may be something else; anyway, I don't think this is an important problem.

Anyway, thanks for your interest and kind words :)

Viva INTERNATIONAL HMG :D
srvet_claudio
Posts: 2193
Joined: Thu Feb 25, 2010 8:43 pm
Location: Uruguay

Re: ANSI -> UT8 Conversion

Post by srvet_claudio »

esgici wrote:Hi All

After OEM->ANSI, now we are in another migration ANSI -> Unicode.

We have string-based conversion functions but, as far as I know, no file-based one.

I guess many people can't start that migration because converting program source files manually is difficult.

I hope this small program (whether or not it counts as a utility) will be useful to our friends.

As always, please don't forgive my faults; any suggestions, corrections and bug reports are welcome :arrow:

Viva HMG :D
Very Nice Friend!!!
Best regards.
Dr. Claudio Soto
(from Uruguay)
http://srvet.blogspot.com
mustafa
Posts: 1158
Joined: Fri Mar 20, 2009 11:38 am
DBs Used: DBF
Location: Alicante - Spain

Re: ANSI -> UT8 Conversion

Post by mustafa »

Hello friends,
First of all, congratulations to Esgici
on your program.

We old "dinosaurs" who come from Clipper Summer '87
and dBfast find it hard to retrain, I admit.
Personally I almost never use the IDE or FMG files;
I have always written my programs with Notepad,
which saves as ANSI by default. I am running tests
with the new HMG 3.2: a file saved as ANSI does not
show the characters "& Ñ ñ € % $ @ #" correctly once
compiled, but if I convert it to UTF-8 they all come
out right.

That is without using SET CODEPAGE TO SPANISH.

All except ---> & ; I don't know whether I have
to use CHR(068), but still nothing shows.

Esgici mentions "UT8 with BOM", which Notepad does not offer;
it only has "UTF-8". Is that the same thing?

Do I have to convert all my source code from ANSI
to "UT8 with BOM"? In Notepad++ I did see the options
"Encode in UTF-8 without BOM" and "Encode in UTF-8".
The same file:
ANSI (Notepad) ------------> 1994 bytes
UTF-8 (Esgici) ------------> 2003 bytes
UTF-8 (Notepad) -----------> 2002 bytes

Forgive my ignorance, but however many posts I read
on this subject, I still don't understand whether new
source files must be saved as UTF-8 in order to work
correctly when compiled with HMG 3.2.

Thanks and regards,
Mustafa :cry:
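
One way to see what an editor actually wrote, whatever its menu calls the format, is to look at the bytes of the saved file. The Python sketch below is only a rough classifier under stated assumptions: the name `classify_encoding` and the four labels are invented for illustration, and a file that fails UTF-8 decoding is simply reported as some 8-bit code page without guessing which one.

```python
import codecs


def classify_encoding(data: bytes) -> str:
    """Roughly classify file bytes by BOM and UTF-8 validity."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    try:
        data.decode("ascii")
        return "ascii"  # identical bytes in ANSI and in UTF-8
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8 without BOM"
    except UnicodeDecodeError:
        return "8-bit (ANSI/OEM)"
```

Applied to the three files of 1994, 2003 and 2002 bytes, a check like this would show directly whether the Notepad and Esgici versions both start with EF BB BF.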
mol
Posts: 3720
Joined: Thu Sep 11, 2008 5:31 am
Location: Myszków, Poland

Re: ANSI -> UT8 Conversion

Post by mol »

Clipper and Harbour add chr(26) (EOF) to the end of the file. That's the reason for the different lengths of the result files.

Re: ANSI -> UT8 Conversion

Post by mustafa »

Hello Mol
The code for ----> &
is CHR(038), I think.
Nothing shows with that either; the file is saved as UTF-8.
Thanks
Mustafa

Re: ANSI -> UT8 Conversion

Post by mustafa »

Hello Mol
Curiously, if you write:
@ 210,100 LABEL Label_c VALUE "ampersand "+ chr(038) WIDTH 290 HEIGHT 25 FONT "ARIAL" SIZE 14
only this shows ------> ampersand , the & symbol does not appear
but if you write:
@ 310,100 LABEL Label_d VALUE "ampersand "+"&" + chr(038) WIDTH 290 HEIGHT 25 FONT "ARIAL" SIZE 14
it comes out right ---> ampersand &
The file was saved as UTF-8.
Curious
Mustafa
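
A likely explanation, offered as an assumption rather than confirmed HMG behaviour: standard Windows label (static) controls treat a single & as a mnemonic prefix and show && as one literal &. With "ampersand " + chr(038) the lone & has no character after it and is not drawn, while "&" + chr(038) is "&&" and is drawn as a single &. If that is what happens here, doubling every & before assigning the label value would be a workaround. The helper name `escape_ampersands` is hypothetical, and Python is used only to illustrate the string transformation.

```python
def escape_ampersands(text: str) -> str:
    """Double each '&' so a prefix-processing control shows it literally."""
    return text.replace("&", "&&")
```

For example, `escape_ampersands("ampersand &")` gives `"ampersand &&"`, the same string that displayed correctly in Label_d.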