Chinese XML Forum: A Brief Look at Chinese Output with PDFlib

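A Brief Look at Chinese Output with PDFlib (Part 1)
Author: Michelle Yi   Date: 2005-11-16 11:35   Source: Internet   Editor: 66

              Abstract: how to use Acrobat's standard Simplified Chinese fonts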

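The PDF file format has become the standard for electronic documents because it is secure and reliable, easy to exchange, and highly faithful to the original layout. PDFlib is a powerful, internationally popular software package for generating PDF documents in batches on the server side. Many government, tax, banking, utility, and postal and telecommunications organizations abroad use it to generate forms and reports in PDF format online.

For users in China, the question we care about most is how to output Simplified Chinese with PDFlib. Here I would like to share some of my own experience with you. Please point out anything I get wrong, and if anything I say conflicts with the PDFlib manual, follow the manual. My e-mail address is bowriver2001@yahoo.ca.

If you have not used PDFlib before and are interested, you can download the PDFlib package from http://www.pdflib.com/products/pdflib/download/index.html (it is also available from the Tools and Resources section of VC知识库). Without a license you can still use all of its features; the generated PDF documents simply carry a PDFlib watermark.

PDFlib provides language bindings for C, C++, Java, Perl, PHP, Python, Tcl, and REALbasic. All of the examples below use C.

How to use Acrobat's standard Simplified Chinese fonts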

PDFlib comes with three Simplified Chinese fonts: STSong-Light, AdobeSongStd-Light-Acro, and STSongStd-Light-Acro. These three are also Acrobat's standard Simplified Chinese fonts.
All three fonts support the following encodings: UniGB-UCS2-H, UniGB-UCS2-V, UniGB-UTF16-H, UniGB-UTF16-V, GB-EUC-H, GB-EUC-V, GBpc-EUC-H, GBpc-EUC-V, GBK-EUC-H, GBK-EUC-V, GBKp-EUC-H, GBKp-EUC-V, GBK2K-H, and GBK2K-V. Table 1.1 below defines each encoding:

Table 1.1

Encoding                        Character set and text format
UniGB-UCS2-H,  UniGB-UCS2-V     Unicode (UCS-2) encoding for the Adobe-GB1 character collection
UniGB-UTF16-H, UniGB-UTF16-V    Unicode (UTF-16BE) encoding for the Adobe-GB1 character collection. Contains mappings for all characters in the GB18030-2000 character set.
GB-EUC-H,      GB-EUC-V         Microsoft Code Page 936 (charset 134), GB 2312-80 character set, EUC-CN encoding
GBpc-EUC-H,    GBpc-EUC-V       Macintosh, GB 2312-80 character set, EUC-CN encoding, Script Manager code 2
GBK-EUC-H,     GBK-EUC-V        Microsoft Code Page 936 (charset 134), GBK character set, GBK encoding
GBKp-EUC-H,    GBKp-EUC-V       Same as GBK-EUC-H, but replaces half-width Latin characters with proportional forms and maps code 0x24 to dollar ($) instead of yuan (￥).
GBK2K-H,       GBK2K-V          GB 18030-2000 character set, mixed 1-, 2-, and 4-byte encoding

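Encodings ending in -H write the text horizontally; those ending in -V write it vertically. Encodings starting with Uni are Unicode-based; choose one of these if your input strings are Unicode. Encodings starting with GB belong to the CP936 family; choose one of these if your input strings are in Code Page 936.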
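The naming rule above can be checked mechanically. The following is a minimal sketch in plain C, not part of PDFlib; the helper names ends_with, is_vertical_cmap, and is_unicode_cmap are hypothetical and exist only for illustration. It tests the suffix and prefix of an encoding name from Table 1.1.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true when `name` ends with `suffix`. */
static bool ends_with(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + (n - s), suffix) == 0;
}

/* Encodings in Table 1.1 that write vertically end in "-V". */
static bool is_vertical_cmap(const char *encoding)
{
    return ends_with(encoding, "-V");
}

/* Unicode-based encodings in Table 1.1 start with "Uni". */
static bool is_unicode_cmap(const char *encoding)
{
    return strncmp(encoding, "Uni", 3) == 0;
}
```

A caller could use is_unicode_cmap to decide whether to pass Unicode text or Code Page 936 text, and is_vertical_cmap to decide how much room to leave for the line.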
To use one of these fonts in PDFlib, pass the font name together with the matching encoding:
int Font_CS;
Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UTF16-H", "");

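As you will soon find, fonts and encodings can be combined in a great many ways, and PDFlib's font functions do not support every combination. The safest choice is to pair one of the three fonts that come with PDFlib with a Unicode-based encoding.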
Below is a C program that uses the fonts and encodings that come with PDFlib:

/*******************************************************************/
/* This example demonstrates the usage of PDFlib builtin fonts     */
/* based on Chinese Simplified Windows.                            */
/*******************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "pdflib.h"

int main(void)
{
    PDF         *p = NULL;
    int         i = 0, j = 0, Left = 50, Top = 800;
    int         Font_E = 0, Font_CS = 0;
    /* "简体中文" (Simplified Chinese) as UTF-16 in little-endian byte order */
    char        TextUnicode[] = "\x80\x7B\x53\x4F\x2D\x4E\x87\x65";
    /* The same text in Code Page 936 */
    char        TextCp936[] = "\xBC\xF2\xCC\xE5\xD6\xD0\xCE\xC4";
    char        EncodingName[100];
    static const char *ChineseFont[] =
        { "STSong-Light", "AdobeSongStd-Light-Acro", "STSongStd-Light-Acro", };
    static const char *Encoding[] =
    {
        "UniGB-UCS2-H",
        "UniGB-UCS2-V",
        "UniGB-UTF16-H",
        "UniGB-UTF16-V",
        "GB-EUC-H",
        "GB-EUC-V",
        "GBpc-EUC-H",
        "GBpc-EUC-V",
        "GBK-EUC-H",
        "GBK-EUC-V",
        "GBKp-EUC-H",
        "GBKp-EUC-V",
        "GBK2K-H",
        "GBK2K-V",
    };
    const int   fsize = sizeof ChineseFont / sizeof (char *);
    const int   esize = sizeof Encoding / sizeof (char *);

    /* create a new PDFlib object */
    if ((p = PDF_new()) == (PDF *) 0)
    {
        printf("Couldn't create PDFlib object (out of memory)!\n");
        return(2);
    }

    PDF_TRY(p) {
        if (PDF_begin_document(p, "pdflib_cs1.pdf", 0, "") == -1)
        {
            printf("Error: %s\n", PDF_get_errmsg(p));
            return(2);
        }
        PDF_set_info(p, "Creator", "pdflib_cs1.c");
        PDF_set_info(p, "Author", "bowriver2001@yahoo.ca");
        PDF_set_info(p, "Title", "Output Chinese Simplify with PDFlib builtin font");
        Font_E = PDF_load_font(p, "Helvetica-Bold", 0, "winansi", "");

        /* One page per font */
        for (i = 0; i < fsize; i++)
        {
            /* Start a new page. */
            Top = 800;
            PDF_begin_page_ext(p, a4_width, a4_height, "");
            PDF_setfont(p, Font_E, 24);
            PDF_show_xy(p, ChineseFont[i], Left + 50, Top);
            Top -= 30;

            /* One sample per encoding */
            for (j = 0; j < esize; j++)
            {
                Font_CS = PDF_load_font(p, ChineseFont[i], 0, Encoding[j], "");

                /* Label the sample with the encoding name. */
                PDF_setfont(p, Font_E, 12);
                strcpy(EncodingName, "");
                strcat(EncodingName, Encoding[j]);
                strcat(EncodingName, ":");
                PDF_show_xy(p, EncodingName, Left, Top);

                PDF_setfont(p, Font_CS, 12);
                if (strstr(Encoding[j], "-H") != NULL)
                {
                    /* It's horizontal encoding. */
                    Top -= 15;
                }
                if (strstr(Encoding[j], "UniGB") != NULL)
                {
                    /* It's unicode encoding. */
                    PDF_show_xy2(p, TextUnicode, 8, Left, Top);
                }
                else
                {
                    /* It's code page 936 encoding. */
                    PDF_show_xy2(p, TextCp936, 8, Left, Top);
                }
                if (strstr(Encoding[j], "-H") != NULL)
                {
                    /* It's horizontal encoding. */
                    Top -= 25;
                }
                else
                {
                    /* It's vertical encoding. */
                    Top -= 65;
                }
            } /* for */

            /* End of page. */
            PDF_end_page_ext(p, "");
        } /* for */
        PDF_end_document(p, "");
    }
    PDF_CATCH(p) {
        printf("PDFlib exception occurred in pdflib_cs1 sample:\n");
        printf("[%d] %s: %s\n",
            PDF_get_errnum(p), PDF_get_apiname(p), PDF_get_errmsg(p));
        PDF_delete(p);
        return(2);
    }
    PDF_delete(p);
    return 0;
}

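A Brief Look at Chinese Output with PDFlib (Part 2)
Besides the fonts that come with PDFlib, you can also use fonts installed on the system and other user fonts.

PDFlib calls TrueType, OpenType, and PostScript fonts installed on Windows and Mac systems (that is, present in or copied into the system font directory) host fonts. PDFlib can load such a font directly by name, but the name must match the file name exactly (it is strictly case-sensitive). For example, to load the font installed on Windows as C:\WINDOWS\Fonts\SimHei.ttf:
int Font_CS = 0;
Font_CS = PDF_load_font(p, "SimHei", 0, "unicode", "");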

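Note that the font name may differ from the font file name, and the same font may even have different names on operating systems in different languages. To see a font's name on Windows, double-click the font file; the first line of the window that opens, minus the trailing "TrueType" or "OpenType", is the font name. For example, double-clicking C:\WINDOWS\Fonts\SimHei.ttf on a Chinese system opens a window whose first line reads "黑体 TrueType", so the font name of that file is "黑体". To pass a multi-byte name like this to PDFlib, it must be given as UTF-8 preceded by a BOM. For "黑体" that form is "\xEF\xBB\xBF\xE9\xBB\x91\xE4\xBD\x93".
So for the Chinese SimHei font, on Chinese Windows we use
PDF_load_font(p, "\xEF\xBB\xBF\xE9\xBB\x91\xE4\xBD\x93", 0, "unicode", "");
while on English Windows we should use
PDF_load_font(p, "SimHei", 0, "unicode", "");
(Tip: Notepad in Windows 2000/XP can produce the UTF-8 encoding. For example, type "黑体" in Notepad and save the file, choosing UTF-8 in the Encoding drop-down list. Then open the file with a tool that can edit binary data, such as UltraEdit, WinHex, or VC, to read off the UTF-8 string including its BOM.)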
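The BOM + UTF-8 byte string can also be computed in code instead of being read out of a hex editor. The following is a minimal sketch in plain C that does not call PDFlib; the function name bmp_to_bom_utf8 is hypothetical, and it assumes the font name consists only of Basic Multilingual Plane characters (no surrogate pairs).

```c
#include <stddef.h>

/*
 * Encode `count` BMP code points (U+0000..U+FFFF, surrogates not handled)
 * as UTF-8 preceded by the BOM bytes EF BB BF.
 * Returns the number of bytes written, not counting the terminating '\0',
 * or 0 if `out` is too small.
 */
static size_t bmp_to_bom_utf8(const unsigned short *cp, size_t count,
                              char *out, size_t outsize)
{
    size_t n = 0, i;

    if (outsize < 4)                 /* BOM plus terminator */
        return 0;
    out[n++] = (char)0xEF;
    out[n++] = (char)0xBB;
    out[n++] = (char)0xBF;

    for (i = 0; i < count; i++) {
        unsigned int c = cp[i];
        size_t need = c < 0x80 ? 1 : (c < 0x800 ? 2 : 3);

        if (n + need + 1 > outsize)  /* keep room for the terminator */
            return 0;
        if (c < 0x80) {
            out[n++] = (char)c;
        } else if (c < 0x800) {
            out[n++] = (char)(0xC0 | (c >> 6));
            out[n++] = (char)(0x80 | (c & 0x3F));
        } else {
            out[n++] = (char)(0xE0 | (c >> 12));
            out[n++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (c & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

For "黑体" the input code points are U+9ED1 and U+4F53, and the result is the nine-byte string quoted above, which can then be passed as the font name.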

Besides fonts installed in Windows, PDFlib can also load other user fonts, but you must give the path when loading them. For example, to use the font C:\Program Files\Adobe\Acrobat 7.0\Resource\CIDFont\AdobeSongStd-Light.otf:
Font_CS = PDF_load_font(p,
    "C:\\Program Files\\Adobe\\Acrobat 7.0\\Resource\\CIDFont\\AdobeSongStd-Light",
    0, "unicode", "");
There is one exception: .ttc (TrueType Collection) fonts. A .ttc file is a collection that holds several fonts, so you cannot load a font by the file name and must use the real font name instead. For instance, we know that C:\WINDOWS\Fonts\MSGOTHIC.TTC contains three fonts named, in order, MS Gothic, MS PGothic, and MS UI Gothic. We can load each one by its font name:
int Font_E = 0;
Font_E = PDF_load_font(p, "MS Gothic", 0, "winansi", ""); /* Use MS Gothic */
PDF_setfont(p, Font_E, 20);
PDF_show_xy(p, "MS Gothic font:", 50, 800);
Font_E = PDF_load_font(p, "MS PGothic", 0, "winansi", ""); /* Use MS PGothic */
...
Font_E = PDF_load_font(p, "MS UI Gothic", 0, "winansi", ""); /* Use MS UI Gothic */

Often, though, we do not know which fonts a .ttc file contains. For this case PDFlib offers another way to load them: by index. First give the font file an alias, then append a colon and a number to the alias (0 for the first font in the file, 1 for the second, and so on):
int Font_E = 0;
/* Give C:\WINDOWS\Fonts\MSGOTHIC.TTC the alias "gothic" */
PDF_set_parameter(p, "FontOutline", "gothic=C:\\WINDOWS\\Fonts\\MSGOTHIC.TTC");
Font_E = PDF_load_font(p, "gothic:0", 0, "winansi", ""); /* Use MS Gothic */
Font_E = PDF_load_font(p, "gothic:1", 0, "winansi", ""); /* Use MS PGothic */
Font_E = PDF_load_font(p, "gothic:2", 0, "winansi", ""); /* Use MS UI Gothic */
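When the alias, path, or index is only known at run time, the two strings used above ("alias=path" for FontOutline and "alias:index" for the font name) have to be assembled. The following is a minimal sketch in plain C that only builds the strings and does not call PDFlib; make_outline_resource and make_ttc_fontname are hypothetical helper names.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build the "alias=path" value for the FontOutline parameter.
 * Returns 0 on success, -1 if the result did not fit into `out`.
 */
static int make_outline_resource(char *out, size_t outsize,
                                 const char *alias, const char *path)
{
    int n = snprintf(out, outsize, "%s=%s", alias, path);
    return (n >= 0 && (size_t)n < outsize) ? 0 : -1;
}

/*
 * Build the "alias:index" font name that selects one font of a .ttc file.
 * Returns 0 on success, -1 if the result did not fit into `out`.
 */
static int make_ttc_fontname(char *out, size_t outsize,
                             const char *alias, int index)
{
    int n = snprintf(out, outsize, "%s:%d", alias, index);
    return (n >= 0 && (size_t)n < outsize) ? 0 : -1;
}
```

The first result would be passed to PDF_set_parameter(p, "FontOutline", ...) and the second to PDF_load_font as the font name. Checking the return value avoids silently handing a truncated path to the library.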

Below is a related example, a C program:

/*******************************************************************/
/* This example demonstrates the usage of host fonts and other     */
/* fonts based on Chinese Simplified Windows.                      */
/*******************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "pdflib.h"

int main(void)
{
    PDF         *p = NULL;
    int         Left = 50, Top = 800;
    int         Font_E = 0, Font_CS = 0;
    char        fontfile[1024];
    char        buf[1024];
    /* "简体中文" (Simplified Chinese) as UTF-16 in little-endian byte order */
    char        TextUnicode[] = "\x80\x7B\x53\x4F\x2D\x4E\x87\x65";

    /* create a new PDFlib object */
    if ((p = PDF_new()) == (PDF *) 0)
    {
        printf("Couldn't create PDFlib object (out of memory)!\n");
        return(2);
    }

    PDF_TRY(p) {
        if (PDF_begin_document(p, "pdflib_cs2.pdf", 0, "") == -1)
        {
            printf("Error: %s\n", PDF_get_errmsg(p));
            return(2);
        }
        PDF_set_info(p, "Creator", "pdflib_cs2.c");
        PDF_set_info(p, "Author", "myi@pdflib.com");
        PDF_set_info(p, "Title", "Output Chinese Simplify with host font and others");

        /* Start a new page. */
        PDF_begin_page_ext(p, a4_width, a4_height, "");
        Font_E = PDF_load_font(p, "Helvetica-Bold", 0, "winansi", "");

        /* Using host font -- C:\WINDOWS\Fonts\SimHei.ttf.
           PDFlib takes multi-byte names as UTF-8 strings with a BOM.
           The font name of SimHei.ttf is "黑体"; its BOM + UTF-8 string is
           "\xEF\xBB\xBF\xE9\xBB\x91\xE4\xBD\x93" */
        Font_CS = PDF_load_font(p, "\xEF\xBB\xBF\xE9\xBB\x91\xE4\xBD\x93", 0, "unicode", "");
        /* On English Windows use this instead:
           Font_CS = PDF_load_font(p, "SimHei", 0, "unicode", ""); */
        PDF_setfont(p, Font_E, 20);
        PDF_show_xy(p, "SimHei font:", Left, Top);
        PDF_setfont(p, Font_CS, 24);
        Top -= 30;
        /* Pass the byte length explicitly; UTF-16 text may contain zero bytes. */
        PDF_show_xy2(p, TextUnicode, 8, Left, Top);

        /* Using other disk-based font file that is not installed in the
           system directory -- C:\PSFONTS\CS\gkai00mp.ttf */
        Top -= 50;
        strcpy(fontfile, "C:\\PSFONTS\\CS\\gkai00mp.ttf");
        sprintf(buf, "kai=%s", fontfile);
        /* Defines kai as alias for ..\gkai00mp.ttf */
        PDF_set_parameter(p, "FontOutline", buf);
        Font_CS = PDF_load_font(p, "kai", 0, "unicode", "");
        PDF_setfont(p, Font_E, 20);
        PDF_show_xy(p, "AR PL KaitiM GB  font:", Left, Top);
        PDF_setfont(p, Font_CS, 24);
        Top -= 30;
        PDF_show_xy2(p, TextUnicode, 8, Left, Top);

        /* Using TrueType collection font with index -- C:\WINDOWS\Fonts\simsun.ttc */
        Top -= 50;
        strcpy(fontfile, "C:\\WINDOWS\\Fonts\\simsun.ttc");
        sprintf(buf, "simsun=%s", fontfile);
        /* Defines simsun as alias for ..\simsun.ttc.
           Declaring the alias once is sufficient to
           configure all fonts in simsun.ttc */
        PDF_set_parameter(p, "FontOutline", buf);

        /* TTC files contain multiple separate fonts.
           Address 1st font by appending a colon character and 0 after alias simsun */
        Font_CS = PDF_load_font(p, "simsun:0", 0, "unicode", "");
        PDF_setfont(p, Font_E, 20);
        PDF_show_xy(p, "simsun:0 font:", Left, Top);
        PDF_setfont(p, Font_CS, 24);
        Top -= 30;
        PDF_show_xy2(p, TextUnicode, 8, Left, Top);

        /* Address 2nd font by appending a colon character and 1 after alias simsun */
        Top -= 50;
        Font_CS = PDF_load_font(p, "simsun:1", 0, "unicode", "");
        PDF_setfont(p, Font_E, 20);
        PDF_show_xy(p, "simsun:1 font:", Left, Top);
        PDF_setfont(p, Font_CS, 24);
        Top -= 30;
        PDF_show_xy2(p, TextUnicode, 8, Left, Top);

        /* End of page. */
        PDF_end_page_ext(p, "");
        PDF_end_document(p, "");
    }
    PDF_CATCH(p) {
        printf("PDFlib exception occurred in pdflib_cs2 sample:\n");
        printf("[%d] %s: %s\n",
            PDF_get_errnum(p), PDF_get_apiname(p), PDF_get_errmsg(p));
        PDF_delete(p);
        return(2);
    }
    PDF_delete(p);
    return 0;
}

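A Brief Look at Chinese Output with PDFlib (Part 3)
1. PDF_show
void PDF_show(PDF *p, const char *text)
void PDF_show2(PDF *p, const char *text, int len)
Outputs text at the current position in the current font and font size.
PDF_show treats the string as null-terminated; if the string may contain null bytes (as multi-byte strings can), use PDF_show2.

2. PDF_show_xy
void PDF_show_xy(PDF *p, const char *text, double x, double y)
void PDF_show_xy2(PDF *p, const char *text, int len, double x, double y)
Outputs text at the given position in the current font and font size.
PDF_show_xy treats the string as null-terminated; if the string may contain null bytes (as multi-byte strings can), use PDF_show_xy2.

3. PDF_continue_text
void PDF_continue_text(PDF *p, const char *text)
void PDF_continue_text2(PDF *p, const char *text, int len)
Outputs text on the next line in the current font and font size.
PDF_continue_text treats the string as null-terminated; if the string may contain null bytes (as multi-byte strings can), use PDF_continue_text2.

4. PDF_fit_textline
void PDF_fit_textline(PDF *p, const char *text, int len, double x, double y, const char *optlist)
Outputs one line of text at the given position according to the options.
If the string is null-terminated, len is 0; otherwise give the exact number of bytes.

5. PDF_fit_textflow
int PDF_create_textflow(PDF *p, const char *text, int len, const char *optlist)
Creates a textflow object and preprocesses the text for the formatting step that follows.
If the string is null-terminated, len is 0; otherwise give the exact number of bytes.
const char *PDF_fit_textflow(PDF *p, int textflow, double llx, double lly, double urx, double ury, const char *optlist)
Outputs the text into the given rectangle.
llx, lly, urx, and ury are the x and y coordinates of the lower left and upper right corners of the rectangle.
void PDF_delete_textflow(PDF *p, int textflow)
Deletes the textflow object and its related data structures.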

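Summary

Function groups 1, 2, and 3 are concise, intuitive, and easy to use. Groups 4 and 5 produce more flexible text formatting through their option lists. Group 5 in particular is designed for multi-line text, with options that control alignment, character spacing, border display, rotation, and so on. Groups 4 and 5 do have one limitation: for multi-byte strings they can handle only Unicode-based encodings. In other words, they do not support cp936 encodings.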
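The reason the length-taking variants matter can be seen without PDFlib at all. The following is a minimal sketch in plain C; utf16_byte_length is a hypothetical helper, and it assumes the text ends with a 16-bit zero code unit. It shows that strlen stops at the first zero byte of UTF-16 text, while a scan in two-byte steps finds the real length to pass as len.

```c
#include <stddef.h>

/*
 * Length in bytes of UTF-16 text that ends with a 16-bit zero code unit.
 * Unlike strlen(), a single zero byte inside a code unit does not stop
 * the scan, because bytes are always examined in pairs.
 */
static size_t utf16_byte_length(const char *text)
{
    size_t n = 0;

    while (text[n] != 0 || text[n + 1] != 0)
        n += 2;
    return n;
}
```

For "AB" in little-endian UTF-16 (bytes 41 00 42 00), strlen reports 1 while utf16_byte_length reports 4. The sample string used in this series happens to contain no zero bytes, which is why passing it to a null-terminated function would go unnoticed; most Latin text would be cut off.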

Below is a related example, a C program (the downloadable source code includes the generated PDF file, PDFlib_cs3.pdf).

/*******************************************************************/
/* This example demonstrates different ways to output Chinese      */
/* Simplified text under Chinese Simplified Windows.               */
/*******************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "pdflib.h"

int main(void)
{
    PDF         *p = NULL;
    int         Left = 50, Top = 800, Right = 545;
    int         Font_E = 0, Font_CS = 0, Font_CS2 = 0, TextFlow = 0;
    /* "简体中文" (Simplified Chinese) as UTF-16 in little-endian byte order */
    char        TextUnicode[] = "\x80\x7B\x53\x4F\x2D\x4E\x87\x65";
    /* The same text in Code Page 936 */
    char        TextCp936[] = "\xBC\xF2\xCC\xE5\xD6\xD0\xCE\xC4";

    /* create a new PDFlib object */
    if ((p = PDF_new()) == (PDF *) 0)
    {
        printf("Couldn't create PDFlib object (out of memory)!\n");
        return(2);
    }

    PDF_TRY(p) {
        if (PDF_begin_document(p, "pdflib_cs3.pdf", 0, "") == -1)
        {
            printf("Error: %s\n", PDF_get_errmsg(p));
            return(2);
        }
        PDF_set_info(p, "Creator", "pdflib_cs3.c");
        PDF_set_info(p, "Author", "myi@pdflib.com");
        PDF_set_info(p, "Title", "Different Ways To Output Chinese Simplify");

        /* Start a new page. */
        PDF_begin_page_ext(p, a4_width, a4_height, "");
        Font_E = PDF_load_font(p, "Helvetica-Bold", 0, "winansi", "");
        Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H", "");
        Font_CS2 = PDF_load_font(p, "STSong-Light", 0, "GB-EUC-H", "");

        /* Using PDF_set_text_pos and PDF_show functions. */
        PDF_setfont(p, Font_E, 20);
        PDF_set_text_pos(p, Left, Top);
        PDF_show(p, "Using PDF_set_text_pos and PDF_show to output text:");
        Top -= 30;
        PDF_set_text_pos(p, Left+20, Top);
        PDF_show(p, "UniGB-UCS2-H encoding:");
        PDF_setfont(p, Font_CS, 24);
        Top -= 30;
        PDF_set_text_pos(p, Left+20, Top);
        PDF_show2(p, TextUnicode, 8);
        Top -= 30;
        PDF_setfont(p, Font_E, 20);
        PDF_set_text_pos(p, Left+20, Top);
        PDF_show(p, "GB-EUC-H encoding:");
        PDF_setfont(p, Font_CS2, 24);
        Top -= 30;
        PDF_set_text_pos(p, Left+20, Top);
        PDF_show2(p, TextCp936, 8);

        /* Using PDF_show_xy function. */
        Top -= 50;
        PDF_setfont(p, Font_E, 20);
        PDF_show_xy(p, "Using PDF_show_xy to output text:", Left, Top);
        Top -= 30;
        PDF_show_xy(p, "UniGB-UCS2-H encoding:", Left+20, Top);
        PDF_setfont(p, Font_CS, 24);
        Top -= 30;
        PDF_show_xy2(p, TextUnicode, 8, Left+20, Top);
        Top -= 30;
        PDF_setfont(p, Font_E, 20);
        PDF_show_xy(p, "GB-EUC-H encoding:", Left+20, Top);
        Top -= 30;
        PDF_setfont(p, Font_CS2, 24);
        PDF_show_xy2(p, TextCp936, 8, Left+20, Top);

        /* Using PDF_continue_text function. */
        Top -= 30;
        PDF_setfont(p, Font_E, 20);
        PDF_set_text_pos(p, Left, Top);
        PDF_continue_text(p, "Using PDF_continue_text to output text:");
        Top -= 30;
        PDF_set_text_pos(p, Left+20, Top);
        PDF_continue_text(p, "UniGB-UCS2-H encoding:");
        PDF_setfont(p, Font_CS, 24);
        PDF_continue_text2(p, TextUnicode, 8);
        PDF_setfont(p, Font_E, 20);
        PDF_continue_text(p, "GB-EUC-H encoding:");
        PDF_setfont(p, Font_CS2, 24);
        PDF_continue_text2(p, TextCp936, 8);

        /* Using PDF_fit_textline function. */
        Top -= 140;
        PDF_setfont(p, Font_E, 20);
        PDF_fit_textline(p, "Using PDF_fit_textline to output text:", 0, Left, Top, "");
        Top -= 30;
        PDF_fit_textline(p, "UniGB-UCS2-H encoding:", 0, Left+20, Top, "");
        PDF_setfont(p, Font_CS, 24);
        Top -= 30;
        PDF_fit_textline(p, TextUnicode, 8, Left+20, Top, "");

        /* Using PDF_create_textflow, PDF_fit_textflow and PDF_delete_textflow.
           Each textflow is deleted once it has been placed. */
        Top -= 30;
        PDF_setfont(p, Font_E, 20);
        TextFlow = PDF_create_textflow(p,
            "Using PDF_create_textflow, PDF_fit_textflow and PDF_delete_textflow to output text:",
            0, "fontname=Helvetica-Bold fontsize=20 encoding=winansi");
        PDF_fit_textflow(p, TextFlow, Left, Top, Right, Top-60, "");
        PDF_delete_textflow(p, TextFlow);
        Top -= 60;
        TextFlow = PDF_create_textflow(p, "UniGB-UCS2-H encoding:", 0,
            "fontname=Helvetica-Bold fontsize=20 encoding=winansi");
        PDF_fit_textflow(p, TextFlow, Left+20, Top, Right, Top-30, "");
        PDF_delete_textflow(p, TextFlow);
        Top -= 30;
        TextFlow = PDF_create_textflow(p, TextUnicode, 8,
            "fontname=STSong-Light fontsize=24 encoding=UniGB-UCS2-H textlen=8");
        PDF_fit_textflow(p, TextFlow, Left+20, Top, Right, Top-30, "");
        PDF_delete_textflow(p, TextFlow);

        /* End of page. */
        PDF_end_page_ext(p, "");
        PDF_end_document(p, "");
    }
    PDF_CATCH(p) {
        printf("PDFlib exception occurred in pdflib_cs3 sample:\n");
        printf("[%d] %s: %s\n",
            PDF_get_errnum(p), PDF_get_apiname(p), PDF_get_errmsg(p));
        PDF_delete(p);
        return(2);
    }
    PDF_delete(p);
    return 0;
}

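A Brief Look at Chinese Output with PDFlib (Part 4)
PDFlib's textformat parameter sets the format of incoming text. Its valid values are:

bytes: each byte in the string corresponds to one character. Mainly used with 8-bit encodings.
utf8: the string is UTF-8 encoded.
ebcdicutf8: the string is EBCDIC-coded UTF-8; used only on IBM iSeries and zSeries.
utf16: the string is UTF-16 encoded. If the string starts with a Unicode byte order mark (BOM), PDFlib reads the BOM and then removes it from the start of the string. If the string has no BOM, its byte order is taken to be the host's byte order. Intel x86 systems are little-endian (BOM bytes 0xFF 0xFE), while Sparc and PowerPC systems are big-endian (BOM bytes 0xFE 0xFF).
utf16be: the string is UTF-16 encoded in big-endian byte order. The BOM gets no special treatment.
utf16le: the string is UTF-16 encoded in little-endian byte order. The BOM gets no special treatment.
auto: equivalent to "bytes" for 8-bit encodings, and to "utf16" for wide-character strings (unicode, glyphid, or a UCS2 or UTF16 CMap).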
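The BOM rule described for "utf16" can be illustrated with a small decoder. The following is a minimal sketch in plain C of the behaviour as described above, not PDFlib's actual implementation; host_is_little_endian and utf16_decode are hypothetical names.

```c
#include <stddef.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const unsigned short probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/*
 * Decode UTF-16 bytes into 16-bit code units.
 * A leading FE FF selects big-endian, a leading FF FE selects
 * little-endian; in both cases the BOM is consumed and not emitted.
 * Without a BOM the host byte order is assumed.
 * Returns the number of code units written to `out`.
 */
static size_t utf16_decode(const unsigned char *in, size_t len,
                           unsigned short *out, size_t outcap)
{
    int little = host_is_little_endian();
    size_t i = 0, n = 0;

    if (len >= 2) {
        if (in[0] == 0xFE && in[1] == 0xFF) {
            little = 0;
            i = 2;
        } else if (in[0] == 0xFF && in[1] == 0xFE) {
            little = 1;
            i = 2;
        }
    }
    for (; i + 1 < len && n < outcap; i += 2) {
        out[n++] = little
            ? (unsigned short)(in[i] | (in[i + 1] << 8))
            : (unsigned short)((in[i] << 8) | in[i + 1]);
    }
    return n;
}
```

With a BOM, the big-endian bytes FE FF 7B 80 and the little-endian bytes FF FE 80 7B both decode to U+7B80 on any host. Without a BOM the result depends on the machine, which is why "utf16be" and "utf16le" exist as explicit choices.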

Programming languages that handle Unicode strings automatically are called Unicode-capable; these include COM, .NET, Java, REALbasic, and Tcl. Languages that need special handling for Unicode strings are called non-Unicode-capable; these include C, C++, Cobol, Perl, PHP, Python, and RPG.
In non-Unicode-capable languages, the "auto" setting handles most text strings correctly.
For Unicode-capable languages the default value of textformat is "utf16"; for non-Unicode-capable languages it is "auto".
In addition, PDFlib supports the character references commonly used in SGML and HTML, provided that the charref parameter is set to true and textformat is set to "bytes":

PDF_set_parameter(p, "charref", "true");
PDF_set_parameter(p, "textformat", "bytes");

Some valid character references:
&#173;      soft hyphen
&#xAD;      soft hyphen
&shy;       soft hyphen
&#x20AC;    Euro glyph (hexadecimal)
&#8364;     Euro glyph (decimal)
&euro;      Euro glyph (entity name)
&lt;        less than sign
&gt;        greater than sign
&amp;       ampersand sign
&Alpha;     Greek Alpha
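To make the numeric forms concrete (for example &#x20AC; in hexadecimal and &#8364; in decimal, both meaning the Euro sign), here is a minimal sketch in plain C of how such a reference maps to a code point. It is an illustration only, not how PDFlib parses them; parse_numeric_charref is a hypothetical name, and entity names such as &euro; are deliberately not handled.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse one numeric character reference at the start of `s`:
 * "&#DDDD;" (decimal) or "&#xHHHH;" (hexadecimal).
 * On success stores the code point in *cp and returns the number of
 * characters consumed, including the final ';'. Returns 0 otherwise.
 */
static size_t parse_numeric_charref(const char *s, unsigned long *cp)
{
    const char *digits;
    char *end;
    int base = 10;
    unsigned long value;

    if (s[0] != '&' || s[1] != '#')
        return 0;
    digits = s + 2;
    if (*digits == 'x' || *digits == 'X') {
        base = 16;
        digits++;
    }
    /* Reject signs and whitespace that strtoul would otherwise accept. */
    if (base == 16 ? !isxdigit((unsigned char)*digits)
                   : !isdigit((unsigned char)*digits))
        return 0;
    value = strtoul(digits, &end, base);
    if (*end != ';')
        return 0;
    *cp = value;
    return (size_t)(end - s) + 1;
}
```

Both "&#x20AC;" and "&#8364;" yield the code point 0x20AC, while "&euro;" is rejected by this sketch because resolving entity names needs a lookup table.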
Below is a related example, a C program (the generated PDF file, PDFlib_cs4.pdf, is attached).

/*******************************************************************/
/* This example demonstrates output of Chinese Simplified text     */
/* with different 'textformat' options under Chinese Simplified    */
/* Windows.                                                        */
/*******************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "pdflib.h"

int main(void)
{
    PDF         *p = NULL;
    int         Font_E = 0, Font_H = 0, Font_CS = 0, Left = 50, y = 800, i = 0;
    const int   INCRY = 25;
    char        text[128], buf[128];

    /* 1 byte text (English: "Simplified Chinese") */
    static const char byte_text[] =
        "\123\151\155\160\154\151\146\151\145\144\040\103\150\151\156\145\163\145";
    static const int byte_len = 18;
    /* Same text as a byte array; note that it has no terminating null byte */
    static const char byte2_text[] = {0x53,0x69,0x6D,0x70,0x6C,0x69,0x66,0x69,0x65,
        0x64,0x20,0x43,0x68,0x69,0x6E,0x65,0x73,0x65};
    static const int byte2_len = 18;

    /* 2 byte text (Simplified Chinese) */
    static const unsigned short utf16_text[] = {0x7B80,0x4F53,0x4E2D,0x6587};
    static const int utf16_len = 8;
    static const unsigned char utf16be_text[] = "\173\200\117\123\116\055\145\207";
    static const int utf16be_len = 8;
    static const unsigned char utf16be_bom_text[] = "\376\377\173\200\117\123\116\055\145\207";
    static const int utf16be_bom_len = 10;
    static const unsigned char utf16le_text[] = "\200\173\123\117\055\116\207\145";
    static const int utf16le_len = 8;
    static const unsigned char utf16le_bom_text[] = "\377\376\200\173\123\117\055\116\207\145";
    static const int utf16le_bom_len = 10;
    static const unsigned char utf8_text[] = "\347\256\200\344\275\223\344\270\255\346\226\207";
    static const int utf8_len = 12;
    static const unsigned char utf8_bom_text[] = "\xEF\xBB\xBF\xE7\xAE\x80\xE4\xBD\x93\xE4\xB8\xAD\xE6\x96\x87";
    static const int utf8_bom_len = 15;
    /* The same four characters written as HTML character references */
    static const char htmlutf16_text[] = "&#x7B80;&#x4F53;&#x4E2D;&#x6587;";
    static const int htmlutf16_len = sizeof(htmlutf16_text) - 1;

    typedef struct
    {
        const char *textformat;
        const char *encname;
        const char *textstring;
        const int  *textlength;
        const char *bomkind;
    } TestCase;

    static const TestCase table_8[] = {
        { "bytes", "winansi", (const char *)byte_text,  &byte_len,  ""},
        { "auto",  "winansi", (const char *)byte_text,  &byte_len,  ""},
        { "bytes", "winansi", (const char *)byte2_text, &byte2_len, ""},
    };
    static const TestCase table_16[] = {
        { "auto",    "unicode", (const char *)utf16_text,       &utf16_len,       ""},
        { "utf16",   "unicode", (const char *)utf16_text,       &utf16_len,       ""},
        { "auto",    "unicode", (const char *)utf16be_bom_text, &utf16be_bom_len, ", UTF-16+BE-BOM"},
        { "auto",    "unicode", (const char *)utf16le_bom_text, &utf16le_bom_len, ", UTF-16+LE-BOM"},
        { "utf16be", "unicode", (const char *)utf16be_text,     &utf16be_len,     ""},
        { "utf16le", "unicode", (const char *)utf16le_text,     &utf16le_len,     ""},
        { "utf8",    "unicode", (const char *)utf8_text,        &utf8_len,        ""},
        { "auto",    "unicode", (const char *)utf8_bom_text,    &utf8_bom_len,    ", UTF-8+BOM"},
        { "bytes",   "unicode", (const char *)htmlutf16_text,   &htmlutf16_len,   ", HTML unicode character"},
    };
    const int   tsize_8 = sizeof table_8 / sizeof (TestCase);
    const int   tsize_16 = sizeof table_16 / sizeof (TestCase);

    /* create a new PDFlib object */
    if ((p = PDF_new()) == (PDF *) 0)
    {
        printf("Couldn't create PDFlib object (out of memory)!\n");
        return(2);
    }

    PDF_TRY(p) {
        if (PDF_begin_document(p, "pdflib_cs4.pdf", 0, "") == -1)
        {
            printf("Error: %s\n", PDF_get_errmsg(p));
            return(2);
        }
        PDF_set_info(p, "Creator", "pdflib_cs4.c");
        PDF_set_info(p, "Author", "myi@pdflib.com");
        PDF_set_info(p, "Title", "Output Chinese Simplify with Different textformat");

        /* Start a new page. */
        PDF_begin_page_ext(p, a4_width, a4_height, "");
        Font_H = PDF_load_font(p, "Helvetica-Bold", 0, "winansi", "");

        /* 8-bit encoding */
        Font_E = PDF_load_font(p, "Times", 0, "winansi", "");
        PDF_setfont(p, Font_H, 24);
        PDF_show_xy(p, "8-bit encoding", Left+40, y);
        y -= 2*INCRY;
        for (i = 0; i < tsize_8; ++i)
        {
            PDF_setfont(p, Font_H, 14);
            sprintf(text, "%s encoding, %s textformat %s: ", table_8[i].encname,
                table_8[i].textformat, table_8[i].bomkind);
            PDF_show_xy(p, text, Left, y);
            y -= INCRY;
            PDF_set_parameter(p, "textformat", table_8[i].textformat);
            PDF_setfont(p, Font_E, 14);
            /* Pass the length explicitly: byte2_text is not null-terminated. */
            PDF_show_xy2(p, table_8[i].textstring, *table_8[i].textlength, Left, y);
            y -= INCRY;
        } /* for */

        /* 16-bit encoding */
        PDF_setfont(p, Font_H, 24);
        y -= 2*INCRY;
        PDF_show_xy(p, "16-bit encoding", Left+40, y);
        y -= 2*INCRY;
        PDF_set_parameter(p, "charref", "true");
        Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H", "");
        for (i = 0; i < tsize_16; i++)
        {
            PDF_setfont(p, Font_H, 14);
            sprintf(text, "%s encoding, %s textformat %s: ", table_16[i].encname,
                table_16[i].textformat, table_16[i].bomkind);
            PDF_show_xy(p, text, Left, y);
            y -= INCRY;
            PDF_setfont(p, Font_CS, 14);
            /* Select the text format for this line through the option list. */
            sprintf(buf, "textformat %s", table_16[i].textformat);
            PDF_fit_textline(p, table_16[i].textstring, *table_16[i].textlength,
                Left, y, buf);
            y -= INCRY;
        } /* for */

        /* End of page. */
        PDF_end_page_ext(p, "");
        PDF_end_document(p, "");
    }
    PDF_CATCH(p) {
        printf("PDFlib exception occurred in pdflib_cs4 sample:\n");
        printf("[%d] %s: %s\n",
            PDF_get_errnum(p), PDF_get_apiname(p), PDF_get_errmsg(p));
        PDF_delete(p);
        return(2);
    }
    PDF_delete(p);
    return 0;
}

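A Brief Look at Chinese Output with PDFlib (Part 5)
Generally, every base font has additional fonts that vary its glyph style. Arial, for example, has the additional fonts Arial Bold, Arial Italic, and Arial Bold Italic. You can usually find or buy the corresponding additional fonts.
Sometimes, though, you need a quick fix or have no strict requirements on the letterforms. In that case you can use artificial font styles. Artificial font styles are an Acrobat feature that simulates bold, italic, and bold-italic styles from the base glyphs. PDFlib supports this feature and observes the restrictions Acrobat places on it. At present the feature is limited to:
1. Acrobat standard fonts; for Simplified Chinese these are the three fonts that come with PDFlib: STSong-Light, AdobeSongStd-Light-Acro, and STSongStd-Light-Acro.
2. .otf OpenType fonts that PDFlib can access, used with an encoding from Table 1.1 (see Part 1) and with the "embedding" parameter set to false.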
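In code, an artificial style is requested through the option list passed when the font is loaded. The following is a minimal sketch in plain C that only picks the option string and does not call PDFlib; fontstyle_optlist is a hypothetical helper, and the strings it returns are the ones used in the example program of this part.

```c
/*
 * Map bold/italic flags to the "fontstyle" option string.
 * An empty option list leaves the font in its normal style.
 */
static const char *fontstyle_optlist(int bold, int italic)
{
    if (bold && italic)
        return "fontstyle bolditalic";
    if (bold)
        return "fontstyle bold";
    if (italic)
        return "fontstyle italic";
    return "";
}
```

The returned string would be passed as the last argument of PDF_load_font, for example together with "STSong-Light" and "UniGB-UCS2-H".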

Below is a related example, a C program (the generated PDF file, PDFlib_cs5.pdf, is attached).

/*******************************************************************/
/* This example demonstrates the usage of artificial font styles   */
/* under Chinese Simplified Windows.                               */
/*******************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "pdflib.h"

int main(void)
{
    PDF         *p = NULL;
    int         Font_H = 0, Font_CS = 0, Left = 50, y = 700;
    const int   INCRY = 25;
    /* "简体中文" (Simplified Chinese) as UTF-16 in little-endian byte order */
    const char  TextUnicode[] = "\x80\x7B\x53\x4F\x2D\x4E\x87\x65";
    const int   TEXTLEN = 8;

    /* create a new PDFlib object */
    if ((p = PDF_new()) == (PDF *) 0)
    {
        printf("Couldn't create PDFlib object (out of memory)!\n");
        return(2);
    }

    PDF_TRY(p) {
        if (PDF_begin_document(p, "pdflib_cs5.pdf", 0, "") == -1)
        {
            printf("Error: %s\n", PDF_get_errmsg(p));
            return(2);
        }
        PDF_set_info(p, "Creator", "pdflib_cs5.c");
        PDF_set_info(p, "Author", "myi@pdflib.com");
        PDF_set_info(p, "Title", "Usage of Artificial font styles");

        /* Start a new page. */
        PDF_begin_page_ext(p, a4_width, a4_height, "");
        Font_H = PDF_load_font(p, "Helvetica-Bold", 0, "winansi", "");
        PDF_setfont(p, Font_H, 24);
        PDF_show_xy(p, "Artificial Font Styles", Left + 100, y);

        /* Normal */
        y -= 2 * INCRY;
        PDF_setfont(p, Font_H, 14);
        PDF_show_xy(p, "Normal", Left, y);
        y -= INCRY;
        Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H", "");
        PDF_setfont(p, Font_CS, 14);
        PDF_show_xy2(p, TextUnicode, TEXTLEN, Left, y);

        /* Italic */
        y -= 2 * INCRY;
        PDF_setfont(p, Font_H, 14);
        PDF_show_xy(p, "Italic", Left, y);
        y -= INCRY;
        Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H", "fontstyle italic");
        PDF_setfont(p, Font_CS, 14);
        PDF_show_xy2(p, TextUnicode, TEXTLEN, Left, y);

        /* Bold */
        y -= 2 * INCRY;
        PDF_setfont(p, Font_H, 14);
        PDF_show_xy(p, "Bold", Left, y);
        y -= INCRY;
        Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H", "fontstyle bold");
        PDF_setfont(p, Font_CS, 14);
        PDF_show_xy2(p, TextUnicode, TEXTLEN, Left, y);

        /* Bold-italic */
        y -= 2 * INCRY;
        PDF_setfont(p, Font_H, 14);
        PDF_show_xy(p, "Bold-italic", Left, y);
        y -= INCRY;
        Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H",
            "fontstyle bolditalic");
        PDF_setfont(p, Font_CS, 14);
        PDF_show_xy2(p, TextUnicode, TEXTLEN, Left, y);

        /* End of page. */
        PDF_end_page_ext(p, "");
        PDF_end_document(p, "");
    }
    PDF_CATCH(p) {
        printf("PDFlib exception occurred in pdflib_cs5 sample:\n");
        printf("[%d] %s: %s\n", PDF_get_errnum(p), PDF_get_apiname(p),
            PDF_get_errmsg(p));
        PDF_delete(p);
        return(2);
    }
    PDF_delete(p);
    return 0;
}

