François Patte
2008-03-10 22:27:13 UTC
Good evening,
I am trying to convert a PDF file to HTML using the pdftohtml provided by F8.
The resulting HTML file contains "nice" characters like ??? instead of an apostrophe,
or ?? instead of ?...
so I think there is some encoding problem.
From man pdftohtml, I got this information:
-enc <string>
    output text encoding name
but I cannot work out which encoding name to pass in order to get
correct UTF-8 output, because:
Error: Couldn't find unicodeMap file for the 'utf8' encoding
is the only answer I get if I try:
pdftohtml -enc utf8 myfile.pdf
I also tried utf-8, latin1, latin-1, ISO_8859-1, ... without any success.
If somebody knows... many thanks in advance.
--
François Patte
UFR de mathématiques et informatique
Université Paris Descartes
45, rue des Saints Pères
F-75270 Paris Cedex 06
Tél. +33 (0)1 44 55 35 61
http://www.math-info.univ-paris5.fr/~patte