luatexja: Commit

Source code repository

Commit metadata

Revision	c85b9aa3a044213f560857e22da5b95f591a6f33 (tree)
Date	2011-11-21 20:01:35
Author	Hironori Kitagawa <h_kitagawa2001@yaho...>
Committer	Hironori Kitagawa

Log message

Updated the draft for post-proceedings.

Change summary

modified: doc/ajt-devel-ltja.tex (diff)

Diff

--- a/doc/ajt-devel-ltja.tex

+++ b/doc/ajt-devel-ltja.tex

		@@ -74,21 +74,22 @@ internal processing methods of \LuaTeX-ja.
74	74	To typeset Japanese documents with \TeX, ASCII \pTeX~\cite{ptex} has
75	75	been widely used in Japan. There are other methods---for example, using
76	76	Omega and OTP~\cite{omega}, or with the CJK package---to do so, however,
77		-these alternative methods did not become a majority. The author thinks
	77	+these alternative methods did not become majority. The author thinks
78	78	that this is because \pTeX\ enables us to produce high-quality documents
79	79	(e.g.,~supporting vertical typesetting), and the appearance of \pTeX\ is
80	80	earlier than that of alternatives described above.
81	81
82		-However, \pTeX\ has been left behind from the extensions of \TeX\
83		-such as \eTeX\ and \pdfTeX, and the diffusion of UTF-8 encoding. In
84		-recent years, the situation has become better, because of development
85		-of \|ptexenc\|~\cite{ptexenc} by Nobuyuki Tsuchimura (\hbox{土村展之}),
	82	+However, \pTeX\ has been left behind from the extensions of \TeX\ such
	83	+as \eTeX\ and \pdfTeX, and the diffusion of UTF-8 encoding. In recent
	84	+years, the situation has become better, by development of
	85	+\|ptexenc\|~\cite{ptexenc} by Nobuyuki Tsuchimura (\hbox{土村展之}),
86	86	$\varepsilon$-\pTeX~\cite{eptex} by the author,~and u\pTeX~\cite{uptex}
87		-by Takuji Tanaka (田中琢爾). However, continuing this approach, namely, to develop
88		-an engine extension localized for Japanese, is not wise. This approach
89		-needs lots of work for \emph{each} engine, and since \LuaTeX\ has an ability
90		-to hook \TeX's internal process by using Lua callbacks, the necessity of
91		-an engine extension is getting smaller.
	87	+by Takuji Tanaka (田中琢爾). However, continuing this approach, namely,
	88	+to develop an engine extension localized for Japanese, is not wise. This
	89	+approach needs lots of work for \emph{each} engine. In addition, if we
	90	+use \LuaTeX, the necessity of an engine extension is getting smaller
	91	+because \LuaTeX\ has an ability to hook \TeX's internal process by using
	92	+Lua callbacks.
92	93
93	94
94	95	There were several experimental attempts to typeset

		@@ -111,18 +112,18 @@ these situations.
111	112
112	113	\subsection{Development policy of \LuaTeX-ja}
113	114	\label{ssec-pol}
114		-The first aim of \LuaTeX-ja project is to implement features (from the
115		-`primitive' level) of \pTeX\ as macros under \LuaTeX, so \LuaTeX-ja is
116		-much affected by \pTeX. However, as development proceeds, some
117		-technical/conceptual difficulties are arisen. Hence we changed the aim
	115	+The first aim of \LuaTeX-ja project was to implement features (from the
	116	+`primitive' level) of \pTeX\ as macros under \LuaTeX, therefore \LuaTeX-ja is
	117	+much affected by \pTeX. However, as development proceeded, some
	118	+technical/conceptual difficulties arose. Hence we changed the aim
118	119	of the project as follows:
119	120	\begin{itemize}
120	121	\item\emph{\LuaTeX-ja offers at least the same flexibility of
121	122	typesetting that p\TeX\ has.}
122	123
123		- We think that the ability of producing outputs conformed to
	124	+ We are not satisfied with the ability of producing outputs conformed to
124	125	JIS~X~4051~\cite{jisx4051}, the Japanese Industrial Standard for
125		- typesetting, or to a technical note~\cite{w3c} by W3C is not enough;
	126	+ typesetting, or to a technical note~\cite{w3c} by W3C;
126	127	if one wants to produce very incoherent outputs for some reason, it
127	128	should be possible.
128	129	In this point, previous attempts of Japanese typesetting with \LuaTeX\

		@@ -144,59 +145,66 @@ In this point, previous attempts of Japanese typesetting with \LuaTeX\
144	145	\subsection{Overview of the processes}
145	146	\label{ssec-over}
146	147	We describe an outline of \LuaTeX-ja's process in order.
	148	+
147	149	\begin{itemize}
148	150	\item In the \|process_input_buffer\| callback: treatment of breaking
149	151	lines after a Japanese character (in Subsection~\ref{ssec-line}).
150	152
151	153	\item In the \|hyphenate\| callback: font replacement.
152	154
153		-\LuaTeX-ja looks into for each \textit{glyph\_node}~$p$ in the list. If
	155	+\LuaTeX-ja looks into for each \textit{glyph\_node}~$p$ in the horizontal list. If
154	156	the character represented by $p$ is considered as a Japanese
155		- character, the font used in $p$ is replaced by the value of
	157	+ character, the font used at $p$ is replaced by the value of
156	158	\|\ltj@curjfnt\|, an attribute for `the current Japanese font'
157	159	at~$p$.
158	160
159		-Furthermore the subtype of $p$ is subtracted by 1 to suppress
160		- hyphenation around it by \LuaTeX, because later processes of
	161	+Furthermore, the subtype of $p$ is subtracted by 1 to suppress
	162	+ hyphenation around $p$ by \LuaTeX, because later processes of
161	163	\LuaTeX-ja take care of all things about Japanese characters.
162	164
163	165	\item In \|pre_linebreak_filter\| and \|hpack_filter\| callbacks:
164	166
165	167	\begin{enumerate}
166	168	\item \LuaTeX-ja has its own stack system, and the current horizontal
167		- list is traversed in this stage to determine what is the level of
168		- \LuaTeX-ja's internal stack at the end of the list (in
169		- Subsection~\ref{ssec-stack}).
	169	+ list is traversed in this stage to determine what the level of
	170	+ \LuaTeX-ja's internal stack at the end of the list is. We will
	171	+ discuss it in Subsection~\ref{ssec-stack}.
170	172
171	173	\item In this stage, \LuaTeX-ja inserts glues/kerns for Japanese
172		- typesetting in the list. This is the core of \LuaTeX-ja (in
173		- Subsection~\ref{ssec-jglue}).
	174	+ typesetting in the list. This is the core routine of \LuaTeX-ja.
	175	+ We will discuss it in Subsections
	176	+ \ref{ssec-jglue}~and~\ref{ssec-jspec} .
174	177
175	178	\item To make a match between a metric and a real font, sometimes
176		- adjustument of the position of (Japanese) glyphs are performed
177		- (Subsection~\ref{ssec-width}).
	179	+ adjustument of the position of (Japanese) glyphs are performed.
	180	+ We will discuss it in Subsection~\ref{ssec-width}.
178	181	\end{enumerate}
179		-\item In the \|mlist_to_hlist\| callback: replacement of Japanese characters in math formulas.
180		-This stage is similar to adjustument of the position of glyphs (see
181		- above), so we omit it from this paper.
	182	+\item In the \|mlist_to_hlist\| callback: treatment of Japanese characters
	183	+ in math formulas. This stage is similar to adjustment of the
	184	+ position of glyphs (see above), so we omit to describe this stage
	185	+ from this paper.
182	186	\end{itemize}
183	187
	188	+In this paper, a \emph{alphabetic character} means a non-Japanese
	189	+character. Similarly, we use the word an \emph{alphabetic font} as the
	190	+counterpart of a jJpanese font.
	191	+
184	192	\subsection{Contents of this paper}
185	193	Here we describe the contents of the rest of this paper briefly. In
186		-Section~\ref{sec:differences_with_ptex},
187		-we describe major differences between \pTeX\ and \LuaTeX-ja.
188		-The next section, Section~\ref{sec:distinction_of_characters},
189		-is concentrated on a problem `how we
190		-distinguish between Japanese characters and alphabetic characters'. In
191		-Section~\ref{sec:current_status}, we show rest of features of \LuaTeX-ja package, and
192		-current status of the package. Finally, in Section~\ref{sec:implementation}, we describe some
193		-internal routines of \LuaTeX-ja.
	194	+Section~\ref{sec:differences_with_ptex}, we describe major differences
	195	+between \pTeX\ and \LuaTeX-ja. The next section,
	196	+Section~\ref{sec:distinction_of_characters}, is concentrated on a
	197	+problem how we distinguish between Japanese characters and alphabetic
	198	+characters. In Section~\ref{sec:current_status}, we show current
	199	+development status of the package. Finally, in
	200	+Section~\ref{sec:implementation}, we describe some internal routines of
	201	+\LuaTeX-ja.
194	202
195	203	\subsection{General information of the project}
196	204	This \LuaTeX-ja project is hosted by SourceForge.jp. The official wiki
197	205	is located on
198	206	\url{http://sourceforge.jp/projects/luatex-ja/wiki/}. There is
199		-no stable version on October 15, 2011, however a set of developer sources can be
	207	+no stable version on October 22, 2011, however a set of developer sources can be
200	208	obtained from the git repository. Members of the project team are as follows
201	209	(in random order): Hironori Kitagawa, Kazuki Maeda, Takayuki Yato,
202	210	Yusuke Kuroki, Noriyuki Abe, Munehiro Yamamoto, Tomoaki Honda,

		@@ -212,7 +220,7 @@ overview of \pTeX, please see Okumura~\cite{ptexjp}.
212	220
213	221	\subsection{Names of control sequences}
214	222	\label{ssec-csname} Because \pTeX\ is an engine modification of Knuth's
215		-original \TeX82 engine, some primitives added by it take a form that is
	223	+original \TeX82 engine, some of the additional primitives take a form that is
216	224	very difficult to be simulated by a macro. For example, an additional
217	225	primitive \|\prebreakpenalty\|$\langle\hbox{\it
218	226	char\_code}\rangle$\|[=]\|$\langle\hbox{\it penalty}\rangle$ in \pTeX\

		@@ -221,21 +229,19 @@ $\langle\hbox{\it char\_code}\rangle$ to $\langle\hbox{\it
221	229	penalty}\rangle$, and this form \|\prebreakpenalty\|$\langle\hbox{\it
222	230	char\_code}\rangle$ can be also used for retrieving the value.
223	231
224		-Moreover, there are some parameters which values of them at the end of a
225		-horizontal box or that of a paragraph are effective in whole box or
226		-paragraph. These parameters were implemented as additional internal
227		-parameters in \pTeX. However, the implementation of these parameters in
228		-\LuaTeX-ja is not so easy; we will discuss it in
229		-Subsection~\ref{ssec-stack}.
	232	+Moreover, there are some internal parameters of \pTeX\ which values of them at the end of a
	233	+horizontal box or that of a paragraph are valid in whole box or
	234	+paragraph. However, the implementation of these parameters in
	235	+\LuaTeX-ja is not so easy; we will discuss it in Subsection~\ref{ssec-stack}.
230	236
231		-From above two~problems we discussed above, the assignment and retrieval
	237	+From above two~problems discussed above, the assignment and retrieval
232	238	of most parameters in \LuaTeX-ja are summarized into the following
233	239	three~control sequences:
234	240	\begin{itemize}
235	241	\item \|\ltjsetparameter{\|$\langle\hbox{\it
236	242	name}\rangle$\|=\|$\langle\hbox{\it value}\rangle$\|,...}\|: for local
237	243	assignment.
238		-\item \|\ltjglobalsetparameter\|: for global assignment. These two control
	244	+\item \|\ltjglobalsetparameter\|: for global assignment. Note that these two control
239	245	sequences obey the value of \|\globaldefs\| primitive.
240	246	\item \|\ltjgetparameter{\|$\langle\hbox{\it
241	247	name}\rangle$\|}[{\|$\langle\hbox{\it optional
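
The control sequences listed in the hunk above might be used as in the following minimal sketch. It assumes a LuaLaTeX document that loads the luatexja package; the parameter names kanjiskip and xkanjiskip are the ones discussed elsewhere in this diff, while the glue values and the sample text are illustrative assumptions, not taken from the commit.

```latex
% Minimal sketch; compile with LuaLaTeX. All values are illustrative.
\documentclass{article}
\usepackage{luatexja}
\begin{document}
% Local assignment: the setting is undone when the group ends.
{\ltjsetparameter{kanjiskip=0pt plus 0.4pt minus 0.4pt}日本語の段落。\par}

% Global assignment: the setting survives the end of the group.
% \maxdimen asks LuaTeX-ja to take the value from the current metric.
{\ltjglobalsetparameter{xkanjiskip=\maxdimen}}

% Retrieval of the current value of a parameter.
\ltjgetparameter{kanjiskip}
\end{document}
```

Both assignment forms obey \globaldefs, as the hunk notes, so the local/global distinction shown here holds only while \globaldefs is zero.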

		@@ -272,7 +278,7 @@ letter `あ' will be treated as an alphabetic character by
272	278	\LuaTeX-ja. Then, it is natural to have a space between `あ' and `y' in
273	279	the output, where the actual output in the figure does not so. This is
274	280	because `あ' is considered a Japanese character by \LuaTeX-ja,
275		-when \LuaTeX-ja does a decision whether U+FFFFF will be added to the
	281	+when \LuaTeX-ja does the decision whether U+FFFFF will be added to the
276	282	input line~2.
277	283
278	284	\begin{figure}

		@@ -295,7 +301,7 @@ JFMs are essentially same, and only differ in their names. For example,
295	301	\|min10.tfm\| and \|goth10.tfm\|, which are JFMs shipped with \pTeX\ for
296	302	seriffed \emph{mincho} family and sans-seriffed \emph{gothic} family,
297	303	differ their \|FAMILY\| and \|FACE\| only. Moreover, \|jis.tfm\| and
298		-\|jisg.tfm\|, which consists a parts of \emph{jis} font metric, which is
	304	+\|jisg.tfm\|, which is included in the \emph{jis} font metric, which is
299	305	used in \emph{jsclasses}~\cite{jsclasses} by Haruhiko Okumura (奥村晴彦),
300	306	are totally same as binary files. Considering this situation, we
301	307	decided to separate `real' fonts and metrics used for them in

		@@ -305,14 +311,14 @@ remarks:
305	311	\begin{itemize}
306	312	\item A control sequence \|\jfont\| must be used for Japanese fonts, instead of \|\font\|.
307	313	\item \LuaTeX-ja automatically loads the \emph{luaotfload} package, so
308		- \|file:\| and \|name:\| prefixes, and various font features can be
309		- used as the line~1 in Figure~\ref{fig-jfdef}.
	314	+ \hbox{\tt file:} and \hbox{\tt name:} prefixes, and various font features can be
	315	+ used as the first line in Figure~\ref{fig-jfdef}.
310	316	\item The \|jfm\| key specifies the metric for the font. In
311	317	Figure~\ref{fig-jfdef}, both fonts will use a metric stored in a
312	318	Lua script named \|jfm-ujis.lua\|. This metric is the standard
313	319	metric in \LuaTeX-ja, and is based on JFMs used in the \emph{otf}
314	320	package~\cite{otf}.
315		-\item The \|psft:\| prefix can be used to specify name-only, non-embedded
	321	+\item The \hbox{psft:} prefix can be used to specify name-only, non-embedded
316	322	fonts. When one display a pdf with these fonts, actual fonts which
317	323	will be used for them depend on a pdf reader.
318	324	\end{itemize}
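
The remarks on \jfont in the hunk above can be illustrated with a short plain LuaTeX sketch. The font file name ipaexm.ttf and the PostScript name GothicBBB-Medium are placeholders chosen for illustration (any installed Japanese font would do), and the control sequence names \exmin and \exgt are hypothetical.

```latex
% Minimal sketch for plain LuaTeX; font names are placeholders.
\input luatexja.sty
% file: prefix via luaotfload, the standard metric jfm-ujis.lua,
% and -kern so that the font's own kerning does not clash with the metric.
\jfont\exmin={file:ipaexm.ttf:-kern;jfm=ujis} at 10pt
% psft: prefix for a name-only, non-embedded font.
\jfont\exgt={psft:GothicBBB-Medium:jfm=ujis} at 10pt
\exmin 明朝体の例。\exgt ゴシック体の例。
\bye
```

With the psft: font, the glyphs actually shown depend on the PDF reader, as the last item of the list says.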

		@@ -326,7 +332,7 @@ metrics by default; \|jfm-ujis.lua\|, \|jfm-jis.lua\| based on the
326	332	\emph{jis} font metric, and \|jfm-min.lua\| based on old \|min10.tfm\|.
327	333
328	334	Note that \|-kern\| in features
329		-is important, because kerning information from real font itself will
	335	+is important, because kerning information from a real font itself will
330	336	clash with glue/kern informations from the metric.
331	337
332	338	\begin{figure}

		@@ -351,7 +357,7 @@ process will be done when a horizontal box or a paragraph is ended, so
351	357
352	358	The situation for Japanese characters is more complicated.
353	359	Glues (and kerns) which are needed for Japanese
354		-typesetting will be divided into the following three categories:
	360	+typesetting are divided into the following three categories:
355	361	\begin{itemize}
356	362	\item Glue (or kern) from the metric of Japanese fonts (\emph{JFM glue},
357	363	for short).

		@@ -385,6 +391,8 @@ this specification are to behave like alphabetic characters in \LuaTeX\
385	391	for \LuaTeX-ja's process.
386	392
387	393	\subsection{Insertion of glues/kerns for Japanese typesetting: specification}
	394	+\label{ssec-jspec}
	395	+
388	396	\begin{table}
389	397	\caption{Examples of differences between \pTeX\ and \LuaTeX-ja.}
390	398	\label{tab-jfmglue}

		@@ -422,16 +430,16 @@ Now we will take a look inside the insertion process itself, and describe 4~poin
422	430	\begin{description}
423	431	\item[Ignored Nodes]
424	432	As noted in the previous subsection, the insertion process in \pTeX\ can
425		- be interrupted by saying \|{}\| or anything else\footnote{This
	433	+ be interrupted by saying \|{}\| or anything else.\footnote{This
426	434	is why some tricks like \texttt{ちょ\char`\{\char`\}っと} for
427		- \texttt{min10.tfm} and other `old' JFMs work.}. This leads
428		- the second row in Table~\ref{tab-jfmglue}, or
429		- Figure~\ref{fig-ptexjfm}. `The process is interrupted' means
430		- that \pTeX\ does not think the letter `】\inhibitglue' is
431		- followed by `\inhibitglue【', hence two half-width glues are
432		- inserted between between `】\inhibitglue' and `\inhibitglue【',
433		- where one is from `】\inhibitglue' and another is from
434		- `\inhibitglue【'.
	435	+ \texttt{min10.tfm} and other `old' JFMs work.} This leads the
	436	+ second row in Table~\ref{tab-jfmglue}, or
	437	+ Figure~\ref{fig-ptexjfm}. Here `the process is interrupted'
	438	+ means that \pTeX\ does not think the letter `】\inhibitglue'
	439	+ is followed by `\inhibitglue【', hence two half-width glues
	440	+ are inserted between `】\inhibitglue' and `\inhibitglue【',
	441	+ where the left one is from `】\inhibitglue' and the right one
	442	+ is from `\inhibitglue【'.
435	443
436	444	On the other hand, in \LuaTeX-ja, the process is done inside
437	445	\|hpack_filter\| and \|pre_linebreak_filter\| callbacks. Hence,

		@@ -444,14 +452,14 @@ As noted in the previous subsection, the insertion process in \pTeX\ can
444	452	\emph{penalty\_node}---, as shown in (4).
445	453
446	454
447		-By the way, around a \emph{glyph\_node} $p$ there may be some nld odes
	455	+By the way, around a \emph{glyph\_node} $p$ there may be some nodes
448	456	attached to $p$. These are an accent and kerns for
449		- positioning it, and a kern from the italic
	457	+ moving it to the right place, and a kern from the italic
450	458	correction\footnote{\TeX82 (and \LuaTeX) does not distinguish
451	459	between explicit kern and a kern for italic correction. To
452		- distinguish them, an additional subtype for kern is introduced
	460	+ distinguish them, an additional subtype for a kern is introduced
453	461	in \pTeX. On the other hand, \LuaTeX-ja uses an additional attribute and
454		- redefines \texttt{\char`\\/}.} for $p$. It is natural that
	462	+ redefines \texttt{\char`\\/} to set this attribute.} for $p$. It is natural that
455	463	these attachments should be ignored inside the process. Hence
456	464	\LuaTeX-ja takes this approach, as the latest version of
457	465	\pTeX\ (p3.2). This explains (2) in the figure.

		@@ -485,7 +493,7 @@ However this seems to be unnatural, since two Japanese fonts in the
485	493	\mc 明朝）\gt （ゴシック
486	494	\end{quote}
487	495	One might have the situation that this default behavior is not
488		- suitable. \LuaTeX-ja offers a way to cope with this case, but
	496	+ suitable. \LuaTeX-ja offers a way to handle this situation, but
489	497	we leave it to the manual~\cite{man}.
490	498
491	499	\item[Fonts with Different Metrics]

		@@ -503,9 +511,9 @@ As the previous paragraph, this input yields the following, by \pTeX:
503	511	\mc 漢）\hbox{}\gt （漢）\hbox{}\large （大
504	512	\end{quote}
505	513	We thought that amounts of spaces between parentheses in above output
506		- are too much. So we changed the default behavior of
507		- \LuaTeX-ja so that the amount of a glue between two Japanese
508		- characters with different metrics is the average of a glue
	514	+ are too much. Hence we changed the default behavior of
	515	+ \LuaTeX-ja, so that the amount of a glue between two Japanese
	516	+ characters with different metrics is the \emph{average} of a glue
509	517	from the left character and that from the right
510	518	character. For example, Figure~\ref{fig-diffmet} shows the
511	519	output from above input. The width of glue indicated `(1)' is

		@@ -538,33 +546,32 @@ We thought that amounts of spaces between parentheses in above output
538	546
539	547	\item[\emph{kanjiskip} and \emph{xkanjiskip}]
540	548	In \pTeX, the value of \emph{xkanjiskip} is controlled by a skip named
541		- \|\xkanjiskip\|. A defect of this implementation is that the
542		- value of \emph{xkanjiskip} is not connected with the size of
543		- the currnt Japanese font. It seems that \|EXTRASPACE\|,
	549	+ \|\xkanjiskip\|. A well-known defect of this implementation is
	550	+ that the value of \emph{xkanjiskip} is not connected with the
	551	+ size of the currnt Japanese font. It seems that \|EXTRASPACE\|,
544	552	\|EXTRASTRETCH\|, \|EXTRASHRINK\| parameters in a JFM are
545	553	reserved for specifying the default value of
546	554	\emph{xkanjiskip} in a unit of the design size, but \pTeX\
547		- did not use these parameters.
	555	+ did not use these parameters, actually.
548	556
549	557	Considering this situation of p\TeX, \LuaTeX-ja can use the value of
550	558	\emph{xkanjiskip} that specified in a metric. If the value of
551		- \emph{xkanjiskip} on user side (this is the
552		- \textsf{xkanjiskip} parameter in \|\ltjsetparameter\|) is
	559	+ \emph{xkanjiskip} on user side (this is the value of
	560	+ \textsf{xkanjiskip} parameter of \|\ltjsetparameter\|) is
553	561	\|\maxdimen\|, then \LuaTeX-ja use the specification from
554	562	the current used metric as the actual value of
555		- \emph{xkanjiskip}.
556		-This description also applies for \emph{kanjiskip}.
	563	+ \emph{xkanjiskip}. This description also applies for \emph{kanjiskip}.
557	564	\end{description}
558	565
559	566	\section{Distinction of characters}
560		-\label{sec:distinction_of_characters}
561		-Since \LuaTeX\ can handle Unicode characters natively, it is a major
562		-problem that how we distinguish Japanese characters and alphabetic
563		-characters. For example, the multiplication sign (U+00D7) exists both in
564		-ISO-8859-1 (hence in Latin-1 Supplement in Unicode) and in the basic
565		-Japanese character set JIS~X~0208. It is not desirable that this
566		-character is treated as an alphabetic char, because this symbol is often
567		-used in the sense of `negative' in Japan.
	567	+\label{sec:distinction_of_characters} Since \LuaTeX\ can handle Unicode
	568	+characters natively, it is a major problem that how we distinguish
	569	+Japanese characters and alphabetic characters. For example, the
	570	+multiplication sign (U+00D7) exists both in ISO-8859-1 (hence in Latin-1
	571	+Supplement in Unicode) and in the basic Japanese character set
	572	+JIS~X~0208. It is not desirable that this character is always treated as
	573	+an alphabetic character, because this symbol is often used in the sense
	574	+of `negative' in Japan.
568	575
569	576	\subsection{Character ranges}
570	577	Before we describe the approach taken is \LuaTeX-ja, we review the

		@@ -573,13 +580,13 @@ approach taken by u\pTeX. u\pTeX\ extends the \|\kcatcode\| primitive in
573	580	among alphabetic characters~(15), \emph{kanji}~(16), \emph{kana}~(17),
574	581	\emph{kanji}, \emph{Hangul}~(17), or~\emph{other CJK characters}~(18).
575	582	The assignment to \|\kcatcode\| can be done by a Unicode
576		-block\footnote{There are some exceptions. For example, U+FF00--FFEF
	583	+block.\footnote{There are some exceptions. For example, U+FF00--FFEF
577	584	(Halfwidth and Fullwidth Forms) are divided into three blocks in recent
578		-u\pTeX.}.
	585	+u\pTeX.}
579	586
580	587	\LuaTeX-ja adopted a different approach. There are many Unicode blocks
581	588	in Basic Multilingual Plane which are not included in
582		- Japanese fonts, it is inconvenient if we treat by a Unicode
	589	+ Japanese fonts, therefore it is inconvenient if we process by a Unicode
583	590	block. Furthermore, JIS~X~0208 are not just union of Unicode
584	591	blocks; for example, the intersection of JIS~X~0208 and
585	592	Latin-1 Supplement is shown in

		@@ -607,14 +614,14 @@ u\pTeX.}.
607	614
608	615	%%Example...
609	616
610		-We note that \LuaTeX-ja offers two additional control sequence,
	617	+We note that \LuaTeX-ja offers two additional control sequences,
611	618	\|\ltjjachar\| and \|\ltjalchar\|. They are similar to \|\char\|
612		- primitive, but \|\ltjjachar\| always yields a Japanese character (if
613		- the argument is more than or equal to 128) and \|\ltjalchar\| always
	619	+ primitive, however \|\ltjjachar\| always yields a Japanese character, provided that
	620	+ the argument is more than or equal to 128, and \|\ltjalchar\| always
614	621	yields an alphabetic character, regardless of the argument.
615	622
616	623	\subsection{Default setting of ranges}
617		-Patches for plain \TeX\ and \LaTeXe of \LuaTeX-ja predefines 8~character
	624	+Patches for plain \TeX\ and \LaTeXe\ of \LuaTeX-ja predefine 8~character
618	625	ranges, as shown in Table~\ref{tab-chrrng}. Almost of these ranges are
619	626	just the union of Unicode blocks, and determined from the Adobe-Japan1-6
620	627	character collection~\cite{aj16}, and JIS~X~0208. Among these 8~ranges,
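
The two control sequences \ltjjachar and \ltjalchar introduced in the hunk above take a character code in the same way as the \char primitive. A minimal sketch, assuming a LuaLaTeX document with the luatexja package loaded and using U+201C only as an example code point:

```latex
% Minimal sketch; compile with LuaLaTeX.
\documentclass{article}
\usepackage{luatexja}
\begin{document}
% U+201C forced to be a Japanese character (the code is at least 128).
\ltjjachar"201C

% U+201C forced to be an alphabetic character, regardless of its range.
\ltjalchar"201C
\end{document}
```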

		@@ -659,19 +666,19 @@ This is because some 8-bit TFMs have a glyph in this range; for example,
659	666	\subsection{Control sequences producing Unicode characters}
660	667	\label{ssec-unichar}
661	668
662		-The \emph{fontspec} package\footnote{Preciously
663		-saying, it is the \emph{xunicode} package, originally a package for
664		-\XeTeX and automatically loaded by the \emph{fontspec} package.} offer
665		-various control sequences that produce Unicode characters. However, they as
666		-it stands cannot work with the default range setting of \LuaTeX-ja. For
667		-example, \|\textquotedblleft\| is just an abbreviation of
668		-\|\char"201C\relax\| %"
669		-and the character U+201C (LEFT DOUBLE QUOTATION
670		-MARK) is treated as an Japanese character, because it belongs to the
671		-range~3.
672		-This problem is resolved by using \|\ltjalchar\| instead of the \|\char\| primitive.
673		-It is included in an optional package named \texttt{luatexja-\penalty0fontspec.sty}.
674		-Figure~\ref{fig-unitxt} ...
	669	+The \emph{fontspec} package\footnote{Preciously saying, it is the
	670	+\emph{xunicode} package, originally a package for \XeTeX and
	671	+automatically loaded by the \emph{fontspec} package.} offers various
	672	+control sequences that produce Unicode characters. However, these
	673	+control sequences as it stands cannot work correctly with the default
	674	+range setting of \LuaTeX-ja. For example, \|\textquotedblleft\| is just
	675	+an abbreviation of \|\char"201C\relax\|, and the character U+201C (LEFT %"
	676	+DOUBLE QUOTATION MARK) is treated as an Japanese character, because it
	677	+belongs to the range~3. This problem is resolved by using \|\ltjalchar\|
	678	+instead of the \|\char\| primitive. It is included in an optional package
	679	+named \texttt{luatexja-\penalty0fontspec.sty}. Figure~\ref{fig-unitxt}
	680	+shows several ways o typeset a character , both as a Japanese character
	681	+and as as an alphabetic characters.
675	682
676	683	\begin{figure}
677	684	\begin{LTXexample}

		@@ -685,7 +692,7 @@ Figure~\ref{fig-unitxt} ...
685	692	\end{figure}
686	693
687	694	The situation looks similar in math formulas, but in fact it differs.
688		-Control sequences that represents ordinary symbols defined by the
	695	+Each control sequence that represents an ordinary symbol defined by the
689	696	\emph{unicode-math} package is just synonym of a character. For example,
690	697	the meaning of \|\otimes\| is just the character U+2297 (CIRCLED TIMES),
691	698	which is included in the range~3. However, it is difficult to define a

		@@ -693,11 +700,11 @@ control sequence like \|\ltjalUmathchar\| as a counterpart of
693	700	\|\Umathchar\|, since an input like `\|\sum^\ltjalUmathchar ...\|' has to be
694	701	permitted.
695	702
696		-However, we couldn't include a solution to this problem in time for this
697		-paper, due to a lack of time. We are just testing a solution that we
698		-will explain it below:
	703	+However, we couldn't develop a satisfactory solution to this problem in
	704	+time for this paper, due to a lack of time. We are just testing a
	705	+solution below:
699	706	\begin{itemize}
700		-\item \LuaTeX-ja has a list of character codes which will be treated as
	707	+\item \LuaTeX-ja has a list of character codes which will be always reated as
701	708	alphabetic characters in math mode. Considering 8-bit TFMs for
702	709	math symbols, this list includes natural numbers between \|"80\| and
703	710	\|"FF\| by default.

		@@ -708,7 +715,7 @@ codes of characters which are mentioned in the \emph{unicode-math}
708	715	\end{itemize}
709	716
710	717
711		-We would like to extend treatments described in this section to 8-bit
	718	+We would like to extend treatments described in this subsection to 8-bit
712	719	font encodings, but we leave it to further development too.
713	720
714	721	\section{Current status of development}

		@@ -799,7 +806,7 @@ An example output is shown in Figure~\ref{fig-bls}. The left half is the
799	806	baseline of Japanese characters is shifted down. On the other
800	807	hand, the right half is the output when
801	808	\textsf{yalbaselineshift} is positive, hence the baseline of
802		- alphabetic characters is shifted. Figure~\ref{fig-small}
	809	+ alphabetic characters is shifted down. Figure~\ref{fig-small}
803	810	shows an intresting use of these parameters.
804	811
805	812	\end{description}

		@@ -856,12 +863,12 @@ To work this behavior well, a list of all (alphabetic) encodings defined
856	863	\subsection{Classes for Japanese documents}
857	864	To produce `high-quality' Japanese documents, we need not only that
858	865	Japanese characters are correctly placed, but also class files for
859		-Japanese documents. In \pTeX, there are two major families of classes:
	866	+Japanese documents. Two major families of classes are widely used in Japan:
860	867	\emph{jclasses} which is distributed with the official p\LaTeXe\ macros,
861	868	and \emph{jsclasses}. At the present, \LuaTeX-ja
862	869	simply contains their counterparts: \emph{ltjclasses} and
863		-\emph{ltjsclasses}. However, the policy on classess is not determined
864		-now, and we hope to have another family of classes which are useful in
	870	+\emph{ltjsclasses}. However, the policy on classes is not determined
	871	+now, and we hope to have another family of classes which are useful for
865	872	commercial printing. In the author's opinion, \emph{ltjclasses} is
866	873	better to stay as an example of porting of class files for \pTeX\ to
867	874	\LuaTeX-ja.

		@@ -885,18 +892,20 @@ the former two packages.
885	892	control sequences producing Unicode characters.
886	893
887	894	\item[The \emph{otf} package]
888		-This package is widely used in \pTeX\ for characters which is
	895	+This package is widely used in \pTeX\ for typesetting characters which is
889	896	not in JIS~X~0208, and for using more than one weight in \emph{mincho}
890	897	and \emph{gothic} font families. Therefore \LuaTeX-ja supports features
891	898	in the \emph{otf} package, by loading \texttt{luatexja-\penalty0otf.sty}
892	899	manually. Note that characters by \|\UTF{xxxx}\| and
893	900	\|\CID{xxxx}\| are not appended to the current list as a
894		- \emph{glyph\_node}, so they are not affected by callbacks by
895		- the \emph{luaotfload} package. We have another remark; \|\CID\|
896		- does not work with TrueType fonts.
	901	+ \emph{glyph\_node}, to avoid from callbacks by the
	902	+ \emph{luaotfload} package. We have another remark; \|\CID\|
	903	+ does not work with TrueType fonts, since \|\CID\| use the
	904	+ conversion table between CID and the glyph order of the
	905	+ current Japanese font.
897	906
898	907	\item[The \emph{listings} package]
899		-It is known for users of \pTeX that there is a patch \|jlisting.sty\| for
	908	+It is known for users of \pTeX\ that there is a patch \|jlisting.sty\| for
900	909	the \emph{listings} package, to use Japanese characters in
901	910	the \|lstlisting\| environment. Generally speaking, it also can
902	911	be used in \LuaTeX-ja. However, it seems to be that a

		@@ -905,11 +914,11 @@ It is known for users of \pTeX that there is a patch \|jlisting.sty\| for
905	914	use the \emph{showexpl} package.
906	915
907	916	There is another way to use characters above 256 with the
908		- \emph{listings} package (described in\cite{apl}), however,
	917	+ \emph{listings} package (described in\cite{apl}). However,
909	918	this method is not suitable for Japanese, since the number of
910	919	Japanese characters is very large. We hope that the
911		- \emph{listings} package will be able to cope with all characters above
912		- 256 in the future.
	920	+ \emph{listings} package will be able to handle all characters above
	921	+ 256 without any patch, in the future.
913	922
914	923
915	924	\end{description}

		@@ -917,10 +926,11 @@ There is another way to use characters above 256 with the
917	926
918	927
919	928	\section{Implementation}
	929	+\label{sec:implementation}
920	930	\subsection{Handling of Japanese fonts}
921	931	In \pTeX, there are three slots for maintaining current fonts, namely
922		-\|\font\| for alphabetic fonts, \|\jfont\| for Japanese font (in horizontal
923		-direction) and \|\tfont\| for Japanese font (in vertical direction). With
	932	+\|\font\| for alphabetic fonts, \|\jfont\| for Japanese fonts (in horizontal
	933	+direction) and \|\tfont\| for Japanese fonts (in vertical direction). With
924	934	these slots, we can manage the current font for alphabetic characters
925	935	and that for Japanese characters separately in \pTeX. However, \LuaTeX\
926	936	has only one slot for maintaining the current font, as \TeX82. This

		@@ -947,7 +957,7 @@ they cannot be an argument of \|\the\|, \|\fontname\|, nor \|\textfont\|.
947	957
948	958	Callbacks by the \emph{luaotfload} package, e.g.,~replacement of glyphs
949	959	according to font features, are executed just after `Examination of
950		-Stack Level' (see Subsection~\ref{ssec-over}). Note that calculation of
	960	+Stack Level' (see Subsections \ref{ssec-over}~and~\ref{ssec-stack}). Note that calculation of
951	961	character classes for each Japanese character is done \emph{after} the
952	962	these callbacks for now.
953	963

		@@ -955,10 +965,10 @@ these callbacks for now.
955	965	\label{ssec-stack}
956	966
957	967	As we noted in Subsection~\ref{ssec-csname}, parameters that the values
958		-at the end of a horizontal box or that of a paragraph are effective in
	968	+at the end of a horizontal box or that of a paragraph are valid in
959	969	whole box or paragraph, such as \emph{kanjiskip}, cannot be implemented
960	970	by internal integers or registers of other types in \TeX. We explain it
961		-in this section.
	971	+in this subsection.
962	972
963	973	\begin{figure}
964	974	\begin{lstlisting}

		@@ -1039,7 +1049,7 @@ needed. In the context of \pTeX, this process was performed using virtual fonts.
1039	1049	On the other hand, Lua\TeX-ja does the adjustment by encapsuling a glyph
1040	1050	into a horizontal box. There are two main reasons why we adopted this
1041	1051	method; one is that we feared Lua codes for coexisting with callbacks by
1042		-\|luaotfload\| package would be large if we use virtual fonts, and the
	1052	+the \|luaotfload\| package would be large if we use virtual fonts, and the
1043	1053	other is to cope with shifting of the baseline of characters at the
1044	1054	same time.
1045	1055

		@@ -1093,29 +1103,32 @@ same time.
1093	1103	\end{figure}
1094	1104
1095	1105	Figure~\ref{fig-pos} shows the adjustment process. A large square $M$ is
1096		-the imaginary body which is specified in the metric, and a vertical
	1106	+the imaginary body specified in the metric, and a vertical
1097	1107	rectangle is the imaginary body of a real glyph. First, the real glyph
1098	1108	is aligned with respect to the width of $M$. In the figure, the real
1099	1109	glyph is aligned `middle'; this setting is useful for the full-width
1100		-middle dot `・'. We have other settings, namely, `left' and `right'.
	1110	+middle dot `・'. We have other settings, `left' and `right'.
1101	1111	After that, it is shifted according to the value of \|left\| and \|down\|,
1102		-which are specified in the metric. The final position of the real glyph
	1112	+which are specified in the metric, too. The final position of the real glyph
1103	1113	is shown by the gray rectangle~$R$. If the amount of shifting the baseline is
1104	1114	not zero, $M$ (and hence the real glyph) is shifted by that amount.
1105	1115
1106		-We would like to remark briefly about the vertical position of a glyph.
1107		-A JFM (or the metric used in \LuaTeX-ja) and the real font used for it
1108		-may have different height or depth. In that case, it may look better if
1109		-the real glyph is shifted vertically to match the height-depth ratio
1110		-specified in the metric. This situation is carefully studied by
	1116	+We would like to remark briefly on the vertical position of a real
	1117	+glyph. A JFM (or a metric used in \LuaTeX-ja) and a real font used for
	1118	+it may have different height or depth. In that case, it may look better
	1119	+if the real glyph is shifted vertically to match the height-depth ratio
	1120	+specified in the metric, while any vertical adjustment except the
	1121	+adjustment by the \|down\| value does not performed in the present
	1122	+implementation of \LuaTeX-ja . This situation is carefully studied by
1111	1123	Otobe~\cite{min10}. Here the policy on this problem is not determined
1112		-now, however we would like to offer several solutions in future development.
	1124	+now, however we would like to offer several solutions in future
	1125	+development.
1113	1126
1114	1127	\section{Conclusion}
1115	1128	We have discussed about our \LuaTeX-ja package, which is much affected
1116	1129	by \pTeX. For now, it can be used for experimental use, however there
1117	1130	are much refinements which are needed for regular use. The author hopes
1118		-that this paper and this project contribute the typesetting Japanese,
	1131	+that this paper and \LuaTeX-ja project contribute the typesetting Japanese,
1119	1132	and possibly other Asian languages, under \LuaTeX.
1120	1133
1121	1134	\section*{Acknowledgements}
