luatexja: Commit

Commit metadata

Revision: d2fc66f97142cda1a1c89baa11e18df6b5e0bc1b (tree)
Date: 2011-11-22 07:35:00
Author: KUROKI Yusuke <kuroky@user...>
Committer: KUROKI Yusuke

Log message

Updated the draft for post-proceedings.

Change summary

Diff

--- a/doc/ajt-devel-ltja.tex
+++ b/doc/ajt-devel-ltja.tex
@@ -1,1230 +1,1238 @@
1-%#!lualatex ajt-devel-ltja
2-\documentclass{ajt}
3-
4-%%% Packages used in this paper
5-
6-%%% Font setting for \LuaTeX; this is extracted from ajt.cls
7-\makeatletter
8- \if@print
9- \RequirePackage{fontspec,xunicode}
10- \RequirePackage{luatextra}
11- \setmainfont[Mapping=tex-text]{Palatino LT Std}
12- \setsansfont[Mapping=tex-text]{Optima LT Std}
13- \else
14- \RequirePackage{fontspec,luatextra}
15- \setmainfont[Mapping=tex-text]{TeX Gyre Pagella} % \simeq Palatino
16- \fi
17-
18-%%% LuaTeX-ja
19-\usepackage{luatexja,luatexja-fontspec}
20-\ltjsetparameter{jacharrange={-3,-8}}
21-\DeclareFontShape{JY3}{mc}{m}{n}{<-> s*[0.92489] file:ipam.ttf:jfm=ujis}{}
22-\DeclareFontShape{JY3}{gt}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=ujis}{}
23-% quick hack: monospaced Japanese font by \ttfamily
24-\DeclareKanjiFamily{JY3}{\ttdefault}{}{}
25-\DeclareFontShape{JY3}{\ttdefault}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=mono}{}
26-
27-
28-%%% LTXexample environment
29-\usepackage{showexpl,lltjlisting}
30-\lstset{basicstyle=\ttfamily\small, width=0.3\textwidth, basewidth=.5em}
31-
32-%%% Verbatim environment
33-\usepackage{fancyvrb}
34-\CustomVerbatimEnvironment{code}{Verbatim}%
35-{numbers=left,xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small}
36-\CustomVerbatimEnvironment{codewithoutnum}{Verbatim}%
37-{xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small}
38-\CustomVerbatimEnvironment{codewithoutnumsmall}{Verbatim}%
39-{xleftmargin=1.5em,baselinestretch=1.0,fontsize=\footnotesize}
40-\DefineShortVerb{\|}
41-
42-%%% Others
43-\usepackage{mflogo,booktabs}
44-\definecolor{grayx}{gray}{0.85}
45-
46-%%% Mandatory article metadata %%%
47-\title{Development of \LuaTeX-ja package}
48-\author{Hironori Kitagawa {\normalsize 北川 弘典}}
49-\address{\LuaTeX-ja project team}
50-\email{h\_kitagawa2001@yahoo.co.jp}
51-
52-\keywords{\TeX, p\TeX, \LuaTeX, \LuaTeX-ja, Japanese}
53-\abstract{%
54-\LuaTeX-ja package is a macro package for typesetting Japanese
55-documents under \LuaTeX. The package has more flexibility of
56-typesetting than \pTeX, which is widely used Japanese extension of \TeX,
57-and has corrected some unwanted features of \pTeX.
58-In this paper, we describe specifications, the current status and some
59-internal processing methods of \LuaTeX-ja.
60-}
61-
62-\newcommand{\parname}[1]{\textsf{#1}}
63-\newcommand{\jstrut}{\vrule width0pt height\cht depth\cdp}
64-\newcommand{\imagfm}[1]{\ifvmode\leavevmode\fi%
65- \hbox{\fboxsep=0pt\fbox{\setbox0=\hbox{#1}\copy0\kern-\wd0
66- \smash{\vrule width \wd0 height 0.4pt depth0.4pt}}}}
67-\begin{document}
68-
69-%%% Do not forget to start with \maketitle!
70-\maketitle
71-
72-\section{Introduction}
73-\subsection{History}
74-To typeset Japanese documents with \TeX, ASCII \pTeX~\cite{ptex} has
75-been widely used in Japan. There are other methods---for example, using
76-Omega and OTP~\cite{omega}, or the CJK package---to do so; however,
77-these alternatives never became mainstream. The author thinks
78-that this is because \pTeX\ enables us to produce high-quality documents
79-(e.g.,~supporting vertical typesetting), and because \pTeX\ appeared
80-earlier than the alternatives described above.
81-
82-However, \pTeX\ has been left behind by extensions of \TeX\ such
83-as \eTeX\ and \pdfTeX, and by the spread of UTF-8 encoding. In recent
84-years, the situation has improved through the development of
85-|ptexenc|~\cite{ptexenc} by Nobuyuki Tsuchimura (\hbox{土村展之}),
86-$\varepsilon$-\pTeX~\cite{eptex} by the author,~and u\pTeX~\cite{uptex}
87-by Takuji Tanaka (田中琢爾). However, continuing this approach, namely,
88-developing an engine extension localized for Japanese, is not wise. This
89-approach needs lots of work for \emph{each} engine. In addition, if we
90-use \LuaTeX, the need for an engine extension becomes smaller
91-because \LuaTeX\ can hook into \TeX's internal processing by using
92-Lua callbacks.
93-
94-
95-There were several experimental attempts to typeset
96-Japanese documents with \LuaTeX\ before. Here we cite three examples:
97-\begin{itemize}
98-\item |luaums.sty|~\cite{luaums} developed by the author. This
99- experimental package is for creating a certain Japanese-based presentation
100- with \LuaTeX.
101-\item the \emph{luajalayout} package~\cite{luajalayout}, formerly known as the
102- \emph{jafontspec} package, by Kazuki Maeda (前田一貴). This package is based on
103- \LaTeXe\ and \emph{fontspec} package.
104-\item the \emph{luajp-test} package~\cite{luajp-test}, a test package made by
105- Atsuhito Kohda (香田温人), based on articles on the web page~\cite{joylua}.
106-\end{itemize}
107-However, these packages are based on \LaTeXe, and do not have much
108-ability to control the typesetting rules. It is also inefficient for
109-several people to develop similar packages separately. For these
110-reasons, development of the \LuaTeX-ja package was initially started by
111-the author and Kazuki Maeda.
112-
113-\subsection{Development policy of \LuaTeX-ja}
114-\label{ssec-pol}
115-The first aim of the \LuaTeX-ja project was to implement features (from the
116-`primitive' level) of \pTeX\ as macros under \LuaTeX; therefore \LuaTeX-ja is
117-strongly influenced by \pTeX. However, as development proceeded, some
118-technical/conceptual difficulties arose. Hence we changed the aim
119-of the project as follows:
120-\begin{itemize}
121-\item\emph{\LuaTeX-ja offers at least the same flexibility of
122- typesetting that p\TeX\ has.}
123-
124- We are not satisfied with merely being able to produce output that conforms to
125- JIS~X~4051~\cite{jisx4051}, the Japanese Industrial Standard for
126- typesetting, or to a technical note~\cite{w3c} by W3C;
127- if one wants to produce very incoherent output for some reason, it
128- should be possible.
129-In this respect, the previous attempts at Japanese typesetting with \LuaTeX\
130- that we cited in the previous subsection are inadequate.
131-
132-\pTeX\ has some flexibility of typesetting, by changing internal
133- parameters such as |\kanjiskip| or |\prebreakpenalty|, and by using
134- custom JFM (Japanese TFM). Therefore we decided to include this
135- functionality in \LuaTeX-ja.
136-
137-\item\emph{\LuaTeX-ja isn't mere re-implementation or porting of \pTeX;
138- some (technically and/or conceptually) inconvenient features of
139- \pTeX\ are modified.}
140-
141- We describe this point in more detail in the next section.
142-\end{itemize}
143-
144-
145-\subsection{Overview of the processes}
146-\label{ssec-over}
147-We describe an outline of \LuaTeX-ja's process in order.
148-
149-\begin{itemize}
150-\item In the |process_input_buffer| callback: treatment of breaking
151- lines after a Japanese character (in Subsection~\ref{ssec-line}).
152-
153-\item In the |hyphenate| callback: font replacement.
154-
155-\LuaTeX-ja examines each \textit{glyph\_node}~$p$ in the horizontal list. If
156- the character represented by $p$ is considered as a Japanese
157- character, the font used at $p$ is replaced by the value of
158- |\ltj@curjfnt|, an attribute for `the current Japanese font'
159- at~$p$.
160-
161-Furthermore, the subtype of $p$ is decremented by 1 to suppress
162- hyphenation around $p$ by \LuaTeX, because later processes of
163- \LuaTeX-ja take care of all things about Japanese characters.
164-
165-\item In |pre_linebreak_filter| and |hpack_filter| callbacks:
166-
167-\begin{enumerate}
168-\item \LuaTeX-ja has its own stack system, and the current horizontal
169- list is traversed in this stage to determine what the level of
170- \LuaTeX-ja's internal stack at the end of the list is. We will
171- discuss it in Subsection~\ref{ssec-stack}.
172-
173-\item In this stage, \LuaTeX-ja inserts glues/kerns for Japanese
174- typesetting in the list. This is the core routine of \LuaTeX-ja.
175- We will discuss it in Subsections
176- \ref{ssec-jglue}~and~\ref{ssec-jspec}.
177-
178-\item To match a metric with a real font, the positions of
179- (Japanese) glyphs are sometimes adjusted.
180- We will discuss it in Subsection~\ref{ssec-width}.
181-\end{enumerate}
182-\item In the |mlist_to_hlist| callback: treatment of Japanese characters
183- in math formulas. This stage is similar to adjustment of the
184- position of glyphs (see above), so we omit its description
185- from this paper.
186-\end{itemize}
187-
188-In this paper, an \emph{alphabetic character} means a non-Japanese
189-character. Similarly, we use the term \emph{alphabetic font} as the
190-counterpart of a Japanese font.
191-
192-\subsection{Contents of this paper}
193-Here we describe the contents of the rest of this paper briefly. In
194-Section~\ref{sec:differences_with_ptex}, we describe major differences
195-between \pTeX\ and \LuaTeX-ja. The next section,
196-Section~\ref{sec:distinction_of_characters}, concentrates on the
197-problem of how we distinguish between Japanese characters and alphabetic
198-characters. In Section~\ref{sec:current_status}, we show the current
199-development status of the package. Finally, in
200-Section~\ref{sec:implementation}, we describe some internal routines of
201-\LuaTeX-ja.
202-
203-\subsection{General information of the project}
204-This \LuaTeX-ja project is hosted by SourceForge.jp. The official wiki
205-is located on
206-\url{http://sourceforge.jp/projects/luatex-ja/wiki/}. There is
207-no stable version as of October 22, 2011; however, a set of developer sources can be
208-obtained from the git repository. Members of the project team are as follows
209-(in random order): Hironori Kitagawa, Kazuki Maeda, Takayuki Yato,
210-Yusuke Kuroki, Noriyuki Abe, Munehiro Yamamoto, Tomoaki Honda,
211-and~Shuzaburo Saito.
212-
213-
214-\section{Major differences with \pTeX}
215-\label{sec:differences_with_ptex}
216-In this section, we explain several major differences between \pTeX\
217-and our \LuaTeX-ja. For general information on Japanese typesetting and the
218-overview of \pTeX, please see Okumura~\cite{ptexjp}.
219-
220-
221-\subsection{Names of control sequences}
222-\label{ssec-csname} Because \pTeX\ is an engine modification of Knuth's
223-original \TeX82 engine, some of the additional primitives take a form that is
224-very difficult to simulate with a macro. For example, an additional
225-primitive |\prebreakpenalty|$\langle\hbox{\it
226-char\_code}\rangle$|[=]|$\langle\hbox{\it penalty}\rangle$ in \pTeX\
227-sets the amount of penalty inserted before a character whose code is
228-$\langle\hbox{\it char\_code}\rangle$ to $\langle\hbox{\it
229-penalty}\rangle$, and this form |\prebreakpenalty|$\langle\hbox{\it
230-char\_code}\rangle$ can also be used for retrieving the value.
231-
232-Moreover, there are some internal parameters of \pTeX\ whose values at the end of a
233-horizontal box or of a paragraph are valid throughout the whole box or
234-paragraph. However, the implementation of these parameters in
235-\LuaTeX-ja is not so easy; we will discuss it in Subsection~\ref{ssec-stack}.
236-
237-Because of the two~problems discussed above, the assignment and retrieval
238-of most parameters in \LuaTeX-ja are consolidated into the following
239-three~control sequences:
240-\begin{itemize}
241-\item |\ltjsetparameter{|$\langle\hbox{\it
242- name}\rangle$|=|$\langle\hbox{\it value}\rangle$|,...}|: for local
243- assignment.
244-\item |\ltjglobalsetparameter|: for global assignment. Note that these two control
245- sequences obey the value of |\globaldefs| primitive.
246-\item |\ltjgetparameter{|$\langle\hbox{\it
247- name}\rangle$|}[{|$\langle\hbox{\it optional
248- argument}\rangle$|}]|: for retrieval. The returned value is always
249- a string.
250-\end{itemize}
251-
252-\subsection{Line-break after a Japanese character}
253-\label{ssec-line}
254-
255-Japanese texts can break lines almost everywhere, in contrast with
256-alphabetic texts, which can break lines only between words (or by
257-hyphenation). Hence, \pTeX's input processor is modified so that a
258-line-break after a Japanese character doesn't emit a space. However,
259-there is no way to customize the input processor of \LuaTeX, other than
260-to hack its CWEB source. All a macro package can do is to modify an input line
261-before \LuaTeX\ begins to process it, inside the |process_input_buffer|
262-callback.
263-
264-Hence, in \LuaTeX-ja, a comment letter (we reserve U+FFFFF for this
265-purpose) will be appended to an input line, if this line ends with a Japanese
266-character.\footnote{Strictly speaking, it also requires that the catcode
267-of the end-line character is 5~(\emph{end-of-line}). This condition is
268-useful under the verbatim environment.} One might jump to the conclusion
269-that the treatment of a line-break by \pTeX\ and that by \LuaTeX-ja are
270-exactly the same; however, they differ in that \LuaTeX-ja's
271-judgement of whether a comment letter will be appended to the line is made
272-\emph{before} the line is actually processed by \LuaTeX.
273-
274-Figure~\ref{fig-linebreak} shows an example of this situation; the
275-command at the first line marks most Japanese characters as
276-`non-Japanese characters'. In other words, from that command onward, the
277-letter `あ' will be treated as an alphabetic character by
278-\LuaTeX-ja. Then, it would be natural to have a space between `あ' and `y' in
279-the output, whereas the actual output in the figure has none. This is
280-because `あ' is still considered a Japanese character by \LuaTeX-ja
281-when \LuaTeX-ja decides whether U+FFFFF will be added to
282-input line~2.
283-
284-\begin{figure}
285-\begin{LTXexample}
286-\font\x=IPAMincho \x
287-\ltjsetparameter{jacharrange={-6}}xあ
288-y
289-\end{LTXexample}
290-\caption{A notable sample showing the treatment of a line-break after a
291-Japanese character.}\label{fig-linebreak}
292-\end{figure}
293-
294-\subsection{Separation between `real' fonts and metrics}
295-\label{ssec-sepmet}
296-
297-Traditionally, most Japanese fonts used in typesetting are not
298-proportional, that is, most glyphs have the same size (in most cases,
299-square-shaped). Hence, it is not rare that the contents of different
300-JFMs are essentially the same, and differ only in their names. For example,
301-|min10.tfm| and |goth10.tfm|, which are JFMs shipped with \pTeX\ for
302-the seriffed \emph{mincho} family and the sans-seriffed \emph{gothic} family,
303-differ only in their |FAMILY| and |FACE|. Moreover, |jis.tfm| and
304-|jisg.tfm|, which are included in the \emph{jis} font metric
305-used in \emph{jsclasses}~\cite{jsclasses} by Haruhiko Okumura (奥村晴彦),
306-are identical as binary files. Considering this situation, we
307-decided to separate `real' fonts and metrics used for them in
308-\LuaTeX-ja. Typical declarations of Japanese fonts in the style of plain
309-\TeX\ are shown in Figure~\ref{fig-jfdef}. We would like to add several
310-remarks:
311-\begin{itemize}
312-\item A control sequence |\jfont| must be used for Japanese fonts, instead of |\font|.
313-\item \LuaTeX-ja automatically loads the \emph{luaotfload} package, so
314- \hbox{\tt file:} and \hbox{\tt name:} prefixes, and various font features can be
315- used, as in the first line of Figure~\ref{fig-jfdef}.
316-\item The |jfm| key specifies the metric for the font. In
317- Figure~\ref{fig-jfdef}, both fonts will use a metric stored in a
318- Lua script named |jfm-ujis.lua|. This metric is the standard
319- metric in \LuaTeX-ja, and is based on JFMs used in the \emph{otf}
320- package~\cite{otf}.
321-\item The \hbox{psft:} prefix can be used to specify name-only, non-embedded
322- fonts. When one displays a PDF with these fonts, the actual fonts that
323- will be used for them depend on the PDF reader.
324-\end{itemize}
325-The specification of a metric for \LuaTeX-ja is similar to that of a JFM
326-(see \cite{ptexjp}); characters are grouped into several classes, the
327-size information of characters is specified for each class, and
328-glue/kern insertions are specified for each pair of classes. Although
329-the author has not tried it, it may be possible to develop a program that
330-`converts' a JFM to a metric for \LuaTeX-ja. \LuaTeX-ja offers three
331-metrics by default: |jfm-ujis.lua|, |jfm-jis.lua| based on the
332-\emph{jis} font metric, and |jfm-min.lua| based on old |min10.tfm|.
333-
334- Note that |-kern| in features
335-is important, because kerning information from a real font itself will
336-clash with glue/kern information from the metric.
337-
338-\begin{figure}
339-\begin{verbatim}
340-\jfont\foo=file:ipam.ttf:jfm=ujis;script=latn;-kern;+jp04 at 12pt
341-\jfont\bar=psft:Ryumin-Light:jfm=ujis at 10pt
342-\end{verbatim}
343-\caption{Typical declarations of Japanese fonts.}
344-\label{fig-jfdef}
345-\end{figure}
346-
347-\subsection{Insertion of glues/kerns for Japanese typesetting: timing}
348-\label{ssec-jglue}
349-
350-As described in \cite{luatexref}, \LuaTeX's kerning and ligaturing
351-processes are totally different from those of \TeX82. \TeX82's process is
352-done just when a (sequence of) character is appended to the current
353-list. Thus we can interrupt this process by writing as
354-|f{}irm|. However, \LuaTeX's process is \emph{node-based}, that is, the
355-process will be done when a horizontal box or a paragraph is ended, so
356-|f{}irm| and |firm| yield the same output under \LuaTeX.
357-
358-The situation for Japanese characters is more complicated.
359-Glues (and kerns) which are needed for Japanese
360-typesetting are divided into the following three categories:
361-\begin{itemize}
362-\item Glue (or kern) from the metric of Japanese fonts (\emph{JFM glue},
363- for short).
364-
365-\item Default glue between a Japanese character and an alphabetic
366- character (\emph{xkanjiskip}, for short), usually 1/4 of
367- full-width (\emph{shibuaki}) with some stretch and shrink for
368- justifying each line.
369-\item Default glue between two consecutive Japanese characters
370- (\emph{kanjiskip}, for short). The main reason for this glue is to
371- enable breaking lines almost everywhere in Japanese texts. In most
372- cases, its natural width is zero, with some stretch/shrink for
373- justifying each line.
374-\end{itemize}
375-In \pTeX, these three kinds of glues are treated differently. A JFM glue
376-is inserted when a (sequence of) Japanese character is appended to the
377-current list, the same as for alphabetic characters in \TeX82. This
378-means that one can interrupt the insertion process by saying |{}|. An
379-\emph{xkanjiskip} is inserted just before `hpack' or line-breaking of a
380-paragraph; this timing is somewhat similar to that of \LuaTeX's kerning
381-process. Finally, a \emph{kanjiskip} does not appear as a node anywhere;
382-it appears only implicitly in the calculation of the width of a horizontal box,
383-in line breaking, and in the actual output process to a DVI
384-file. These specifications make \pTeX's behavior very hard to
385-understand.
386-
387-\LuaTeX-ja inserts glues in all three categories simultaneously inside
388-|hpack_filter| and |pre_linebreak_filter| callbacks. The reasons for
389-this specification are to behave like alphabetic characters in \LuaTeX\
390-(as described in the first paragraph), and to clarify the specification
391-for \LuaTeX-ja's process.
392-
393-\subsection{Insertion of glues/kerns for Japanese typesetting: specification}
394-\label{ssec-jspec}
395-
396-\begin{table}
397-\caption{Examples of differences between \pTeX\ and \LuaTeX-ja.}
398-\label{tab-jfmglue}
399-\begin{center}
400-\begin{tabular}{llllllll}
401-\toprule
402-&\multicolumn{1}{c}{(1)}&\multicolumn{1}{c}{(2)}&\multicolumn{1}{c}{(3)}&\multicolumn{1}{c}{(4)}\\
403-Input &|あ】{}【〙\/〘| &|い』\/a| &|う)\hbox{}(| &|え]\special{}[|\\\midrule
404-\pTeX &あ】\hbox{}【〙\hbox{}〘&い』\/a &う)\hbox{}( &え]\hbox{}[\\
405-\LuaTeX-ja &あ】{}【〙\/〘 &い』\/a &う)\hbox{}( &え]\special{}[\\
406-\bottomrule
407-\end{tabular}
408-\end{center}
409-\end{table}
410-
411-\begin{figure}
412-\begin{center}
413-\fontsize{40}{40}\selectfont
414-\imagfm{\jstrut あ}%
415-\imagfm{\jstrut 】\inhibitglue}%
416-\imagfm{\jstrut\kern.5\zw}%
417-\imagfm{\jstrut\kern.5\zw}%
418-\imagfm{\jstrut\inhibitglue【}%
419-\imagfm{\jstrut 〙\inhibitglue}%
420-\imagfm{\jstrut\kern.5\zw}%
421-\imagfm{\jstrut\kern.5\zw}%
422-\imagfm{\jstrut\inhibitglue〘}%
423-\end{center}
424-\caption{Detail of (1) in Table~\ref{tab-jfmglue}.}
425-\label{fig-ptexjfm}
426-\end{figure}
427-
428-Now we will take a look inside the insertion process itself, and describe 4~points.
429-
430-\begin{description}
431-\item[Ignored Nodes]
432-As noted in the previous subsection, the insertion process in \pTeX\ can
433- be interrupted by saying |{}| or anything else.\footnote{This
434- is why some tricks like \texttt{ちょ\char`\{\char`\}っと} for
435- \texttt{min10.tfm} and other `old' JFMs work.} This leads to the
436- second row in Table~\ref{tab-jfmglue}, or
437- Figure~\ref{fig-ptexjfm}. Here `the process is interrupted'
438- means that \pTeX\ does not think the letter `】\inhibitglue'
439- is followed by `\inhibitglue【', hence two half-width glues
440- are inserted between `】\inhibitglue' and `\inhibitglue【',
441- where the left one is from `】\inhibitglue' and the right one
442- is from `\inhibitglue【'.
443-
444- On the other hand, in \LuaTeX-ja, the process is done inside
445- |hpack_filter| and |pre_linebreak_filter| callbacks. Hence,
446- \emph{anything that does not make any node will be
447- ignored}\ in \LuaTeX-ja, as shown in (1) in
448- Table~\ref{tab-jfmglue}. \LuaTeX-ja also ignores any nodes
449- which do not make any contribution to the current horizontal
450- list---\emph{ins\_node}, \emph{adjust\_node},
451- \emph{mark\_node}, \emph{whatsit\_node} and
452- \emph{penalty\_node}---, as shown in (4).
453-
454-
455-By the way, around a \emph{glyph\_node} $p$ there may be some nodes
456- attached to $p$. These are an accent and kerns for
457- moving it to the right place, and a kern from the italic
458- correction\footnote{\TeX82 (and \LuaTeX) does not distinguish
459- between explicit kern and a kern for italic correction. To
460- distinguish them, an additional subtype for a kern is introduced
461- in \pTeX. On the other hand, \LuaTeX-ja uses an additional attribute and
462- redefines \texttt{\char`\\/} to set this attribute.} for $p$. It is natural that
463- these attachments should be ignored inside the process. Hence
464- \LuaTeX-ja takes this approach, as does the latest version of
465- \pTeX\ (p3.2). This explains (2) in the table.
466-
467-Summarizing the above, one should put an empty horizontal box |\hbox{}|
468- where one wants to interrupt the insertion process in
469- \LuaTeX-ja, as in (3) in the table.
470-
471-\item[Fonts with the Same Metric]
472-Recall that \LuaTeX-ja separated `real' fonts and metrics, as in Subsection~\ref{ssec-sepmet}.
473-Consider the following input, where all Japanese fonts use the same metric
474- (in \LuaTeX-ja), and |\gt| selects \emph{gothic} family for
475- the current Japanese font family:
476-\begin{quote}
477-\begin{verbatim}
478-明朝)\gt (ゴシック
479-\end{verbatim}
480-\end{quote}
481-If the above input is processed by \pTeX, because the insertion process is
482- interrupted by |\gt|, the result looks like
483-\begin{quote}
484-\mc 明朝)\hbox{}\gt (ゴシック
485-\end{quote}
486-However this seems to be unnatural, since two Japanese fonts in the
487- output use the same metric, i.e.,~the same
488- typesetting rule. Hence, we decided that Japanese fonts with
489- the same metric are treated as one font in the insertion
490- process of \LuaTeX-ja. Thus, the output from the above input
491- in \LuaTeX-ja looks like:
492-\begin{quote}
493-\mc 明朝)\gt (ゴシック
494-\end{quote}
495-One might have the situation that this default behavior is not
496- suitable. \LuaTeX-ja offers a way to handle this situation, but
497- we leave it to the manual~\cite{man}.
498-
499-\item[Fonts with Different Metrics]
500-The case where two consecutive Japanese characters use different metrics and/or
501- different sizes is similar. Consider the following input where
502- the \emph{mincho} family and the \emph{gothic} family use
503- different metrics:
504-\begin{quote}
505-\begin{verbatim}
506-漢)\gt (漢)\large (大
507-\end{verbatim}
508-\end{quote}
509-As in the previous paragraph, \pTeX\ yields the following from this input:
510-\begin{quote}
511-\mc 漢)\hbox{}\gt (漢)\hbox{}\large (大
512-\end{quote}
513-We thought that the amount of space between parentheses in the above output
514- is too large. Hence we changed the default behavior of
515- \LuaTeX-ja, so that the amount of glue between two Japanese
516- characters with different metrics is the \emph{average} of the glue
517- from the left character and that from the right
518- character. For example, Figure~\ref{fig-diffmet} shows the
519- output from the above input. The width of the glue marked `(1)' is
520- $(a/2 + a/2)/2 = 0.5a$, and the width of the glue marked `(2)'
521- is $(a/2 + 1.2a/2)/2 = 0.55a$. This default behavior can be
522- changed by the \textsf{diffrentmet} parameter of \LuaTeX-ja.
523-
524-\begin{figure}
525-\begin{center}
526-\fontsize{40}{40}\selectfont
527-\imagfm{\jstrut\smash{%
528- \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr漢\cr
529- \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$a$}\
530- \hrulefill\vrule height .5ex depth .5ex\cr}}}}%
531-\imagfm{\jstrut )\inhibitglue}%
532-\hbox to .5\zw{\hss\normalsize (1)\hss}%
533-\imagfm{\jstrut\inhibitglue\gt (}%
534-\imagfm{\jstrut\gt 漢}%
535-\imagfm{\jstrut\gt )\inhibitglue}%
536-\hbox to .55\zw{\hss\normalsize (2)\hss}%
537-\imagfm{\fontsize{48}{48}\selectfont\jstrut\gt\inhibitglue (}%
538-\imagfm{\fontsize{48}{48}\selectfont\jstrut\smash{%
539- \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr\gt 大\cr
540- \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$1.2a$}\
541- \hrulefill\vrule height .5ex depth .5ex\cr}}}}
542-\end{center}
543-\caption{Fonts with different metrics.}
544-\label{fig-diffmet}
545-\end{figure}
546-
547-\item[\emph{kanjiskip} and \emph{xkanjiskip}]
548-In \pTeX, the value of \emph{xkanjiskip} is controlled by a skip named
549- |\xkanjiskip|. A well-known defect of this implementation is
550- that the value of \emph{xkanjiskip} is not connected with the
551- size of the current Japanese font. It seems that |EXTRASPACE|,
552- |EXTRASTRETCH|, |EXTRASHRINK| parameters in a JFM are
553- reserved for specifying the default value of
554- \emph{xkanjiskip} in a unit of the design size, but \pTeX\
555- did not use these parameters, actually.
556-
557-Considering this situation of p\TeX, \LuaTeX-ja can use the value of
558- \emph{xkanjiskip} that is specified in a metric. If the value of
559- \emph{xkanjiskip} on the user side (this is the value of the
560- \textsf{xkanjiskip} parameter of |\ltjsetparameter|) is
561- |\maxdimen|, then \LuaTeX-ja uses the specification from
562- the currently used metric as the actual value of
563- \emph{xkanjiskip}. This description also applies to \emph{kanjiskip}.
564-\end{description}
565-
566-\section{Distinction of characters}
567-\label{sec:distinction_of_characters} Since \LuaTeX\ can handle Unicode
568-characters natively, how we distinguish
569-Japanese characters from alphabetic characters is a major problem. For example, the
570-multiplication sign (U+00D7) exists both in ISO-8859-1 (hence in Latin-1
571-Supplement in Unicode) and in the basic Japanese character set
572-JIS~X~0208. It is not desirable that this character is always treated as
573-an alphabetic character, because this symbol is often used in the sense
574-of `negative' in Japan.
575-
576-\subsection{Character ranges}
577-Before we describe the approach taken in \LuaTeX-ja, we review the
578-approach taken by u\pTeX. u\pTeX\ extends the |\kcatcode| primitive in
579-\pTeX, to use this primitive for setting how a character is treated
580-among alphabetic characters~(15), \emph{kanji}~(16), \emph{kana}~(17),
581-\emph{Hangul}~(19), or~\emph{other CJK characters}~(18).
582-The assignment to |\kcatcode| can be done by a Unicode
583-block.\footnote{There are some exceptions. For example, U+FF00--FFEF
584-(Halfwidth and Fullwidth Forms) are divided into three blocks in recent
585-u\pTeX.}
586-
587-\LuaTeX-ja adopted a different approach. There are many Unicode blocks
588- in the Basic Multilingual Plane which are not included in
589- Japanese fonts; therefore it is inconvenient to process by Unicode
590- block. Furthermore, JIS~X~0208 is not just a union of Unicode
591- blocks; for example, the intersection of JIS~X~0208 and
592- Latin-1 Supplement is shown in
593- Table~\ref{tab-inter}. Considering these two points, to
594- customize the range of Japanese characters in \LuaTeX-ja, one
595- has to define ranges of character codes in one's source in advance.
596-
597-
598-\begin{table}
599-\caption{Intersection of JIS~X~0208 and Latin-1 Supplement.}
600-\label{tab-inter}
601-\begin{center}
602-\begin{tabular}{llll}
603-\ltjjachar"A7 (U+00A7),&
604-\ltjjachar"A8 (U+00A8),&
605-\ltjjachar"B0 (U+00B0),&
606-\ltjjachar"B1 (U+00B1),\\
607-\ltjjachar"B4 (U+00B4),&
608-\ltjjachar"B6 (U+00B6),&
609-\ltjjachar"D7 (U+00D7),&
610-\ltjjachar"F7 (U+00F7)
611-\end{tabular}
612-\end{center}
613-\end{table}
614-
615-%%Example...
616-
617-We note that \LuaTeX-ja offers two additional control sequences,
618- |\ltjjachar| and |\ltjalchar|. They are similar to |\char|
619- primitive, however |\ltjjachar| always yields a Japanese character, provided that
620- the argument is greater than or equal to 128, and |\ltjalchar| always
621- yields an alphabetic character, regardless of the argument.
622-
623-\subsection{Default setting of ranges}
624-Patches for plain \TeX\ and \LaTeXe\ of \LuaTeX-ja predefine 8~character
625-ranges, as shown in Table~\ref{tab-chrrng}. Most of these ranges are
626-just unions of Unicode blocks, and are determined from the Adobe-Japan1-6
627-character collection~\cite{aj16}, and JIS~X~0208. Among these 8~ranges,
628-the ranges~2, 3, 6, 7, and~8 are considered ranges of Japanese
629-characters, and others are considered ranges of alphabetic
630-characters\footnote{Note that ranges 3~and~8 are considered ranges of
631-alphabetic characters in this paper.}. We remark on ranges 2~and~8:
632-\begin{description}
633-\item[The range~2]
634-JIS~X~0208 includes Greek letters and Cyrillic letters; however, these
635- letters cannot be used for typesetting Greek or Russian, of
636- course. Hence it is reasonable that Greek and
637- Cyrillic letters constitute a separate character range.
638-\item[The range~8]
639-If one wants to use 8-bit TFMs, such as T1 or TS1 encodings, one should
640- mark this range~8 as a range of alphabetic characters by
641-\begin{quote}
642-|\ltjsetparameter{jacharrange={-8}}|
643-\end{quote}
644-This is because some 8-bit TFMs have a glyph in this range; for example,
645- the character `\OE' is located at |"D7| in the T1 encoding. %"
646-\end{description}
647-
648-
649-\begin{table}
650-\caption{Predefined ranges in \LuaTeX-ja.}
651-\label{tab-chrrng}
652-\begin{center}
653-\begin{tabular}{@{\bf}rl}
654-1&(Additional) Latin characters which do not belong to the range~8.\\
655-2&Greek and Cyrillic letters.\\
656-3&Punctuation and miscellaneous symbols.\\
657-4&Unicode blocks which do not intersect with Adobe-Japan1-6.\\
658-5&Surrogates and Supplementary Private Use Areas.\\
659-6&Characters used in Japanese typesetting.\\
660-7&Characters possibly used in CJK typesetting, but not in Japanese.\\
661-8&Characters in Table~\ref{tab-inter}.
662-\end{tabular}
663-\end{center}
664-\end{table}
665-
666-\subsection{Control sequences producing Unicode characters}
667-\label{ssec-unichar}
668-
669-The \emph{fontspec} package\footnote{Precisely speaking, it is the
670-\emph{xunicode} package, originally a package for \XeTeX\ and
671-automatically loaded by the \emph{fontspec} package.} offers various
672-control sequences that produce Unicode characters. However, as they
673-stand, these control sequences do not work correctly with the default
674-range setting of \LuaTeX-ja. For example, |\textquotedblleft| is just
675-an abbreviation of |\char"201C\relax|, and the character U+201C (LEFT %"
676-DOUBLE QUOTATION MARK) is treated as a Japanese character, because it
677-belongs to the range~3. This problem is resolved by using |\ltjalchar|
678-instead of the |\char| primitive. This patch is included in an optional package
679-named \texttt{luatexja-\penalty0fontspec.sty}. Figure~\ref{fig-unitxt}
680-shows several ways to typeset a character, both as a Japanese character
681-and as an alphabetic character.
682-
683-\begin{figure}
684-\begin{LTXexample}
685-×, \char`×, % depend on range setting
686-\ltjalchar`×, % alphabetic char
687-\ltjjachar`×, % Japanese char
688-\texttimes % alph. char (by fontspec)
689-\end{LTXexample}
690-\caption{Control sequences producing a Unicode character.}
691-\label{fig-unitxt}
692-\end{figure}
693-
694-The situation looks similar in math formulas, but in fact it differs.
695-Each control sequence that represents an ordinary symbol defined by the
696-\emph{unicode-math} package is just a synonym for a character. For example,
697-the meaning of |\otimes| is just the character U+2297 (CIRCLED TIMES),
698-which is included in the range~3. However, it is difficult to define a
699-control sequence like |\ltjalUmathchar| as a counterpart of
700-|\Umathchar|, since an input like `|\sum^\ltjalUmathchar ...|' has to be
701-permitted.
702-
703-We could not develop a satisfactory solution to this problem in
704-time for this paper. We are currently testing the following
705-solution:
706-\begin{itemize}
707-\item \LuaTeX-ja has a list of character codes which will always be treated as
708- alphabetic characters in math mode. Considering 8-bit TFMs for
709- math symbols, this list includes natural numbers between |"80| and
710- |"FF| by default.
711-\item Internal commands defined in the \emph{unicode-math}
712- package are redefined so that
713-codes of characters which are mentioned in the \emph{unicode-math}
714- package will be included in the list.
715-\end{itemize}
716-
717-
718-We would like to extend the treatment described in this subsection to 8-bit
719-font encodings, but we leave this to future development as well.
720-
721-\section{Current status of development}
722-\label{sec:current_status}
723-At the moment, \LuaTeX-ja can be used under plain \TeX, and under
724-\LaTeXe. Generally speaking, one only has to load |luatexja.sty|, by the
725-|\input| command or by |\usepackage| (in~\LaTeXe), if one merely wants to
726-typeset Japanese characters. We now look at each part in more detail.
727-
728-\subsection{`Engine extension'}
729-The lowest part of \LuaTeX-ja corresponds to the \pTeX\ extension as
730-\emph{an engine extension of \TeX}. We, the project members, think that
731-this part is almost done. There is one more feature of \LuaTeX-ja which
732-we are going to explain:
733-
734-\begin{description}
735-\item[Shifting Baseline]
736-In order to match Japanese fonts with alphabetic fonts,
737- shifting the baseline of alphabetic characters is sometimes
738- needed. \pTeX\ has a dimension |\ybaselineshift|, which
739- corresponds to the amount of shifting down the baseline of alphabetic
740- characters. This is useful for Japanese-based documents, but
741- not for documents mainly in languages with alphabetic
742- characters.
743-
744-Hence, \LuaTeX-ja extends \pTeX's |\ybaselineshift| to Japanese
745- characters. Namely, \LuaTeX-ja offers two parameters,
746- \textsf{yjabaselineshift} and \textsf{yalbaselineshift}, for the
747- amount of shifting the baseline of Japanese characters and
748- that of alphabetic characters, respectively.
749-\begin{figure}
750-\begin{center}
751-\fontsize{40}{40}\selectfont\fboxsep0mm
752-\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth
753-\hbox to 0.9\linewidth{%
754-\hfil
755-\raise-10pt\imagfm{\jstrut 漢}%
756-\raise-10pt\imagfm{\jstrut 字}\hskip.25\zw%
757-\imagfm{p}%
758-\imagfm{h}%
759-\hfil\hfil
760-\imagfm{\jstrut 漢}%
761-\imagfm{\jstrut 字}\hskip.25\zw%
762-\raise-10pt\imagfm{p}%
763-\raise-10pt\imagfm{h}%
764-\hfil
765-}
766-\end{center}
767-
768-\caption{First example of shifting baseline.}
769-\label{fig-bls}
770-\end{figure}
771-
772-\begin{figure}
773-\begin{center}
774-\fontsize{30}{30}\selectfont\fboxsep0mm
775-\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth
776-\hbox to 0.9\linewidth{%
777-\hfil
778-\imagfm{a}%
779-\imagfm{b}\hskip.25\zw%
780-\imagfm{\jstrut 本}%
781-\imagfm{\jstrut 文}\hskip.33333\zw%
782-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut\inhibitglue (}%
783-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 注}%
784-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 釈}\hskip.1666667\zw%
785-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont c}%
786-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont o}%
787-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}%
788-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}%
789-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont e}%
790-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont n}%
791-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont t}%
792-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut )\inhibitglue}%
793-\hskip.33333\zw%
794-\imagfm{\jstrut 本}%
795-\imagfm{\jstrut 文}%
796-\hfil
797-}
798-\end{center}
799-
800-\caption{Second example of shifting baseline.}
801-\label{fig-small}
802-\end{figure}
803-
804-An example output is shown in Figure~\ref{fig-bls}. The left half is the
805- output when \textsf{yjabaselineshift} is positive, hence the
806- baseline of Japanese characters is shifted down. On the other
807- hand, the right half is the output when
808- \textsf{yalbaselineshift} is positive, hence the baseline of
809- alphabetic characters is shifted down. Figure~\ref{fig-small}
810- shows an interesting use of these parameters.
811-
812-\end{description}
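As a hedged sketch of how these parameters might be set, the following
lowers the baseline of alphabetic characters by |1pt| and leaves that of
Japanese characters unchanged. The value |1pt| is an arbitrary choice for
illustration, and we assume here that the parameter is set through
|\ltjsetparameter|, like \textsf{jacharrange}:
\begin{quote}
|\ltjsetparameter{yalbaselineshift=1pt}|
\end{quote}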
813-Note that \LuaTeX-ja doesn't support vertical typesetting, \emph{tategaki}, for now.
814-
815-\subsection{Patches for plain \TeX\ and \LaTeXe}
816-\pTeX\ has a patch for plain \TeX, namely |ptex.tex|, a patch for the \LaTeXe\
817-macros (this patch and \LaTeXe\ together constitute \emph{p\LaTeXe}), and
818-|kinsoku.tex|, which includes the default setting of \emph{kinsoku
819-shori}, the Japanese line-breaking rules. We ported them to \LuaTeX-ja, except
820-the code related to vertical typesetting, because \LuaTeX-ja doesn't
821-support vertical typesetting yet. We remark on one point related to the
822-porting:
823-\begin{description}
824-
825-\item[Behavior of\/ {\tt\char92fontfamily\/}]
826-The control sequence |\fontfamily| in p\LaTeXe\ changes the current alphabetic
827- font family and/or the current Japanese font family,
828- depending on the argument. More concretely,
829- |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the
830- current alphabetic font family to $\langle\hbox{\it
831- arg\/}\rangle$, if and only if one of the following
832- conditions is satisfied:
833-\begin{itemize}
834-\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$ in
835- \emph{some} alphabetic encoding is already defined in the document.
836-\item There exists an alphabetic encoding $\langle\hbox{\it
837- enc\/}\rangle$ already defined in the document such that a font
838- definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it
839- arg\/}\rangle$|.fd| (all lowercase) exists.
840-\end{itemize}
841-The same criterion is used for changing the Japanese font family.
842-
843-For this behavior to work well, a list of all (alphabetic) encodings defined
844- already in the document is needed. However, since \LuaTeX-ja
845- is loaded as a package, \LuaTeX-ja cannot obtain this list.
846- Hence \LuaTeX-ja adopted a different approach, namely
847- |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the
848- current alphabetic font family to $\langle\hbox{\it
849- arg\/}\rangle$, if and only if:
850-\begin{itemize}
851-\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$
852- in the current alphabetic encoding $\langle\hbox{\it
853- enc\/}\rangle$ is already defined in the document.
854-\item A font definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it
855- arg\/}\rangle$|.fd| (all lowercase) exists.
856-\end{itemize}
857-
858-
859-\end{description}
860-
861-
862-
863-\subsection{Classes for Japanese documents}
864-To produce `high-quality' Japanese documents, we need not only
865-correct placement of Japanese characters, but also class files for
866-Japanese documents. Two major families of classes are widely used in Japan:
867-\emph{jclasses}, which is distributed with the official p\LaTeXe\ macros,
868-and \emph{jsclasses}. At present, \LuaTeX-ja
869-simply contains their counterparts: \emph{ltjclasses} and
870-\emph{ltjsclasses}. However, the policy on classes has not yet been
871-determined, and we hope to have another family of classes which are useful for
872-commercial printing. In the author's opinion, \emph{ltjclasses}
873-should remain as an example of porting class files for \pTeX\ to
874-\LuaTeX-ja.
875-
876-\subsection{Patches for packages}
877-Apart from patches for the \LaTeXe~kernel and classes for Japanese
878-documents, we need to make patches for several packages. At present,
879-we have considered the following packages, and have made patches or ports for
880-the first two packages.
881-
882-\begin{description}
883-\item[The \emph{fontspec} package] The \emph{fontspec} package is built
884- on NFSS2, hence control sequences offered by the
885- \emph{fontspec} package, such as |\setmainfont|, are only
886- effective for alphabetic fonts if \LuaTeX-ja is loaded.
887- \texttt{luatexja-\penalty0fontspec.sty} (not automatically
888- loaded) offers these counterparts for Japanese fonts, with
889- additional `j' in the name of control sequences, such as
890- |\setmainjfont|. As described in
891- Subsection~\ref{ssec-unichar}, it also includes a patch for
892- control sequences producing Unicode characters.
893-
894-\item[The \emph{otf} package]
895-This package is widely used in \pTeX\ for typesetting characters which are
896-not in JIS~X~0208, and for using more than one weight in \emph{mincho}
897-and \emph{gothic} font families. Therefore \LuaTeX-ja supports features
898-in the \emph{otf} package, by loading \texttt{luatexja-\penalty0otf.sty}
899- manually. Note that characters produced by |\UTF{xxxx}| and
900- |\CID{xxxx}| are not appended to the current list as a
901- \emph{glyph\_node}, to keep them away from callbacks by the
902- \emph{luaotfload} package. We have another remark; |\CID|
903- does not work with TrueType fonts, since |\CID| uses the
904- conversion table between CID and the glyph order of the
905- current Japanese font.
906-
907-\item[The \emph{listings} package]
908-It is well known to users of \pTeX\ that there is a patch |jlisting.sty| for
909- the \emph{listings} package, to use Japanese characters in
910- the |lstlisting| environment. Generally speaking, it can also
911- be used in \LuaTeX-ja. However, it seems that a
912- Japanese character after a space is not processed
913- by the \emph{listings} package; this is inconvenient when we
914- use the \emph{showexpl} package.
915-
916-There is another way to use characters above 256 with the
917- \emph{listings} package (described in~\cite{apl}). However,
918- this method is not suitable for Japanese, since the number of
919- Japanese characters is very large. We hope that the
920- \emph{listings} package will be able to handle all characters above
921- 256 without any patch, in the future.
922-
923-
924-\end{description}
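For example, a preamble using the counterparts offered by
\texttt{luatexja-\penalty0fontspec.sty} might look as follows. This is
only a sketch: the font names are assumptions and must be replaced by
fonts installed on the system.
\begin{quote}
\begin{verbatim}
\usepackage{luatexja}
\usepackage{luatexja-fontspec} % not loaded automatically
\setmainfont{TeX Gyre Pagella} % alphabetic font (assumed name)
\setmainjfont{IPAMincho}       % Japanese font (assumed name)
\end{verbatim}
\end{quote}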
925-
926-
927-
928-\section{Implementation}
929-\label{sec:implementation}
930-\subsection{Handling of Japanese fonts}
931-In \pTeX, there are three slots for maintaining current fonts, namely
932-|\font| for alphabetic fonts, |\jfont| for Japanese fonts (in horizontal
933-direction) and |\tfont| for Japanese fonts (in vertical direction). With
934-these slots, we can manage the current font for alphabetic characters
935-and that for Japanese characters separately in \pTeX. However, \LuaTeX\
936-has only one slot for maintaining the current font, like \TeX82. This
937-situation leads to a problem: how can we maintain the `current Japanese
938-font'?
939-
940-There are three approaches to this problem. One approach is to make a
941-mapping table from alphabetic fonts to corresponding Japanese fonts
942-(here we don't assume that NFSS2 is available). Another approach is
943-that we always use composite fonts with alphabetic fonts and Japanese
944-fonts. The third approach is that the information of the current
945-Japanese font is stored in an attribute. We adopted the third approach,
946-since \LuaTeX-ja is much affected by \pTeX\ as we noted in
947-Subsection~\ref{ssec-pol}.
948-
949-As in Figure~\ref{fig-jfdef}, \LuaTeX-ja uses |\jfont| for defining
950-Japanese fonts, as \pTeX\ does. However, because the information of the current
951-Japanese font is stored in an attribute, control sequences defined by
952-|\jfont| (e.g.,~|\foo| and |\bar| in Figure~\ref{fig-jfdef}) do
953-not represent a font in the sense of \TeX82. In other words, each of
954-these control sequences is just an assignment to an attribute, therefore
955-they cannot be an argument of |\the|, |\fontname|, or |\textfont|.
956-
957-
958-Callbacks by the \emph{luaotfload} package, e.g.,~replacement of glyphs
959-according to font features, are executed just after `Examination of
960-Stack Level' (see Subsections \ref{ssec-over}~and~\ref{ssec-stack}). Note that calculation of
961-character classes for each Japanese character is done \emph{after}
962-these callbacks for now.
963-
964-\subsection{Stack management}
965-\label{ssec-stack}
966-
967-As we noted in Subsection~\ref{ssec-csname}, parameters whose values
968-at the end of a horizontal box or of a paragraph are valid for the
969-whole box or paragraph, such as \emph{kanjiskip}, cannot be implemented
970-by internal integers or registers of other types in \TeX. We explain the
971-reason in this subsection.
972-
973-\begin{figure}
974-\begin{lstlisting}
975-void package(int c)
976-{
977- ...
978- d = box_max_depth;
979- unsave();
980- save_ptr -= 4;
981- if (cur_list.mode_field == -hmode) {
982- cur_box = filtered_hpack(cur_list.head_field,
983- cur_list.tail_field, saved_value(1),
984- saved_level(1), grp, saved_level(2));
985- subtype(cur_box) = HLIST_SUBTYPE_HBOX;
986- } else {
987-\end{lstlisting}
988-\caption{An extract of a CWEB-source \texttt{tex/packaging.w} of \LuaTeX.}
989-\label{fig-ltsrc}
990-\end{figure}
991-
992-Figure~\ref{fig-ltsrc} is an extract of a CWEB-source
993-\texttt{tex/packaging.w} of \LuaTeX\ (SVN revision 4358). This function
994-is called just when an explicit |\hbox{...}| or |\vbox{...}| ends, and
995-the function |filtered_hpack()| is where the |hpack_filter| callback and then the
996-actual `hpack' process are performed. Notice that the |unsave()|
997-function is called before |filtered_hpack()|. This is the problem;
998-because of |unsave()|, we can retrieve only the values of registers
999-\emph{outside} the box, even in the |hpack_filter| callback.
1000-
1001-To cope with this problem, \LuaTeX-ja has its own stack system, based on
1002-Lua code in~\cite{stack-mail}. Furthermore, \emph{whatsit} nodes whose
1003-\emph{user\_id} is 30112 (\emph{stack\_node}, for short) will be
1004-appended to the current horizontal list each time the current stack
1005-level is incremented, and their values are the values of
1006-|\currentgrouplevel| at that time. In the beginning of the |hpack_filter|
1007-callback, the list in question is traversed to determine whether the
1008-stack level at the end of the list and that outside the box coincide.
1009-
1010-Let $x$ be the value of |\currentgrouplevel|, and $y$ be the current
1011-stack level, both inside the |hpack_filter| callback, i.e.,~outside a
1012-horizontal box. Consider a list which represents the content of the box,
1013-then we have:
1014-\begin{itemize}
1015-\item A \emph{stack\_node} whose value is $x+1$ (because all materials in
1016- the box are included in a group |\hbox{...}|, the value is at
1017- least $x+1$) in the list represents an assignment related to the
1018- stack system at the top level of the list, like
1019-\begin{quote}
1020-\begin{verbatim}
1021-\hbox{...(assignment)...}
1022-\end{verbatim}
1023-\end{quote}
1024-In this case, the current stack level is incremented to $y+1$ after the assignment.
1025-\item A \emph{stack\_node} whose value is more than $x+1$ in the list represents
1026-an assignment inside another group contained in the box. For example,
1027- the following input creates
1028-a \emph{stack\_node} whose value is $x+3=(x+1)+2$:
1029-\begin{quote}
1030-\begin{verbatim}
1031-\hbox{...{...{...(assignment)}...}...}
1032-\end{verbatim}
1033-\end{quote}
1034-\end{itemize}
1035-Thus, we can conclude that the stack level at the end of the list is
1036-$y+1$, if and only if there is a \emph{stack\_node} whose value is
1037-$x+1$. Otherwise, the stack level is just $y$.
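As a worked illustration with hypothetical values, suppose $x=2$ and
$y=5$. For the input
\begin{quote}
\begin{verbatim}
\hbox{...{...(assignment)}...}
\end{verbatim}
\end{quote}
the only \emph{stack\_node} in the list has the value $x+2=4$, which
differs from $x+1=3$, so the stack level at the end of the list is
$y=5$. For |\hbox{...(assignment)...}|, on the other hand, the
\emph{stack\_node} has the value $x+1=3$, so the stack level at the end
of the list is $y+1=6$.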
1038-
1039-\subsection{Adjustment of the position of Japanese characters}
1040-\label{ssec-width}
1041-
1042-The size of a glyph specified in a metric and that of a real font
1043-usually differ. For example, the letter `\inhibitglue【' is half-width
1044-in |jfm-ujis.lua| or |jis.tfm|, while this letter is full-width like `【'
1045-in most TrueType fonts used in Japanese typesetting, such as
1046-IPA~Mincho. Hence the adjustment of position of such glyphs is
1047-needed. In the context of \pTeX, this process was performed using virtual fonts.
1048-
1049-On the other hand, \LuaTeX-ja does the adjustment by encapsulating a glyph
1050-into a horizontal box. There are two main reasons why we adopted this
1051-method; one is that we feared Lua code for coexisting with callbacks by
1052-the \emph{luaotfload} package would be large if we used virtual fonts, and the
1053-other is to cope with shifting of the baseline of characters at the
1054-same time.
1055-
1056-\begin{figure}
1057-\begin{center}\unitlength=9pt\small
1058-\begin{picture}(15,12)(-1,-3)
1059-
1060-\color{grayx}% real glyph
1061-\put(-1,-1.5){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength}
1062-
1063-\color{black}% real glyph :step1
1064-\thicklines
1065-\put(-1,-1.5){\line(0,1){7}\line(0,-1){2.5}}
1066-\put(5,-1.5){\line(0,1){7}\line(0,-1){2.5}}
1067-\put(-1,5.5){\line(1,0){6}}
1068-\put(-1,-4){\line(1,0){6}}
1069-\put(-1,0){\makebox(0,0)[r]{\strut$R$\,}}
1070-
1071-\thicklines
1072-\put(0,0){\vector(0,1){9}\line(0,-1){3}\vector(1,0){12}}
1073-\put(12,9){\makebox(0,0)[rt]{\strut$M$\,}}
1074-\put(12,0){\line(0,1){9}\vector(0,-1){3}}
1075-\put(0,9){\line(1,0){12}}
1076-\put(0,-3){\line(1,0){12}}
1077-\put(0.2,4.5){\makebox(0,0)[l]{\texttt{height}}}
1078-\put(12.2,-1.5){\makebox(0,0)[l]{\texttt{depth}}}
1079-\put(6,0.2){\makebox(0,0)[b]{\texttt{width}}}
1080-
1081-\thicklines
1082-\put(3,0){\line(0,1){7}\line(0,-1){2.5}\line(1,0){6}}
1083-\put(9,0){\line(0,1){7}\line(0,-1){2.5}}
1084-\put(3,7){\line(1,0){6}}
1085-\put(3,-2.5){\line(1,0){6}}
1086-\newsavebox{\eqdist}
1087-\savebox{\eqdist}(0,0)[c]{%
1088- \thinlines
1089- \put(-0.08,0.2){\line(0,-1){0.4}}%
1090- \put(0.08,0.2){\line(0,-1){0.4}}}
1091-\put(1.5,0){\usebox{\eqdist}}
1092-\put(10.5,0){\usebox{\eqdist}}
1093-
1094-\thicklines
1095-\put(3,-1.5){\vector(-1,0){4}}
1096-\put(1,-1.7){\makebox(0,0)[t]{\texttt{left}}}
1097-\put(3,0){\vector(0,-1){1.5}}
1098-\put(3.2,-0.75){\makebox(0,0)[l]{\texttt{down}}}
1099-\end{picture}
1100-\end{center}
1101-\caption{The position of the `real' glyph.}
1102-\label{fig-pos}
1103-\end{figure}
1104-
1105-Figure~\ref{fig-pos} shows the adjustment process. A large square $M$ is
1106-the imaginary body specified in the metric, and a vertical
1107-rectangle is the imaginary body of a real glyph. First, the real glyph
1108-is aligned with respect to the width of $M$. In the figure, the real
1109-glyph is aligned `middle'; this setting is useful for the full-width
1110-middle dot `・'. We have other settings, `left' and `right'.
1111-After that, it is shifted according to the value of |left| and |down|,
1112-which are specified in the metric, too. The final position of the real glyph
1113-is shown by the gray rectangle~$R$. If the amount of shifting the baseline is
1114-not zero, $M$ (and hence the real glyph) is shifted by that amount.
1115-
1116-We would like to remark briefly on the vertical position of a real
1117-glyph. A JFM (or a metric used in \LuaTeX-ja) and a real font used for
1118-it may have different height or depth. In that case, it may look better
1119-if the real glyph is shifted vertically to match the height-depth ratio
1120-specified in the metric, but no vertical adjustment other than the
1121-adjustment by the |down| value is performed in the present
1122-implementation of \LuaTeX-ja. This situation is carefully studied by
1123-Otobe~\cite{min10}. The policy on this problem has not yet been
1124-determined; however, we would like to offer several solutions in future
1125-development.
1126-
1127-\section{Conclusion}
1128-We have discussed our \LuaTeX-ja package, which is much affected
1129-by \pTeX. For now, it can be used experimentally; however,
1130-many refinements are still needed for regular use. The author hopes
1131-that this paper and the \LuaTeX-ja project contribute to the typesetting of Japanese,
1132-and possibly other Asian languages, under \LuaTeX.
1133-
1134-\section*{Acknowledgements}
1135-The author would like to thank Ken Nakano and Hideaki Togashi for their
1136-development of ASCII \pTeX. The author is very grateful to Haruhiko
1137-Okumura for his leadership in the Japanese \TeX\ community. The author
1138-is also very grateful to members of \LuaTeX-ja project team for their
1139-valuable cooperation in development.
1140-
1141-%%% The style of the bibliography is `amsplain'.
1142-\providecommand{\bysame}{\leavevmode\hbox to3em{\hrulefill}\thinspace}
1143-\providecommand{\href}[2]{#2}
1144-\begin{thebibliography}{99}
1145-
1146-\bibitem{aj16}
1147-Adobe Systems Incorporated, \emph{Adobe-Japan1-6 Character Collection
1148- for CID-Keyed Fonts}, Technical Note~\#5078, 2004.
1149-\url{http://partners.adobe.com/public/developer/en/font/5078.Adobe-Japan1-6.pdf}
1150-
1151-\bibitem{ptex}
1152-ASCII MEDIA WORKS,アスキー日本語\TeX\ (\pTeX).\url{http://ascii.asciimw.jp/pb/ptex/}
1153-
1154-\bibitem{apl}
1155-John Baker, \emph{Typesetting UTF8 APL code with the \LaTeX\ lstlisting package}.
1156-\url{http://bakerjd99.wordpress.com/2011/08/15/}
1157-
1158-\bibitem{omega}
1159-Jin-Hwan~Cho and Haruhiko Okumura, \emph{Typesetting CJK Languages with Omega},
1160-\TeX, XML, and Digital Typography, Lecture Notes in Computer Science, vol.~3130,
1161-Springer, 2004, 139--148.
1162-
1163-\bibitem{joylua}
1164-Yannis Haralambous. \emph{The Joy of \LuaTeX}. \url{http://luatex.bluwiki.com/}
1165-
1166-\bibitem{jisx4051}
1167-Japanese Industrial Standards Committee. \emph{JIS~X~4051: Formatting
1168- rules for Japanese documents}, 1993, 1995, 2004.
1169-
1170-\bibitem{eptex}
1171-北川弘典,$\varepsilon$-\pTeX についてのwiki.
1172-\url{http://sourceforge.jp/projects/eptex/wiki/FrontPage}
1173-
1174-\bibitem{luaums}
1175-北川弘典,\LuaTeX で日本語.
1176-\url{http://oku.edu.mie-u.ac.jp/tex/mod/forum/discuss.php?d=378}
1177-
1178-\bibitem{luatexref}
1179-\LuaTeX\ development team, \emph{The \LuaTeX\ reference}.
1180-\url{http://www.luatex.org/svn/trunk/manual/luatexref-t.pdf} (snapshot of SVN trunk)
1181-
1182-\bibitem{man}
1183-\LuaTeX-ja project team, \emph{The \LuaTeX-ja package}.
1184-Not completed for now. Available at |doc/man-en.pdf| (in English) or
1185- |doc/man-ja.pdf| (in Japanese)
1186-in the Git repository.
1187-
1188-\bibitem{luajp-test}
1189-香田温人,\LuaTeX と日本語.
1190-\url{http://www1.pm.tokushima-u.ac.jp/~kohda/tex/luatex-old.html}
1191-
1192-\bibitem{luajalayout}
1193-前田一貴,luajalayout パッケージ---Lua\LaTeX によ
1194- る日本語組版---.
1195-\url{http://www-is.amp.i.kyoto-u.ac.jp/lab/kmaeda/lualatex/luajalayout/}
1196-
1197-\bibitem{jsclasses}
1198-奥村晴彦,p\LaTeXe 新ドキュメントクラス.
1199-\url{http://oku.edu.mie-u.ac.jp/~okumura/jsclasses/}
1200-
1201-\bibitem{ptexjp}
1202-Haruhiko Okumura, \emph{\pTeX\ and Japanese Typesetting},
1203- The Asian Journal of \TeX\ \textbf{2}~(2008), 43--51.
1204-
1205-\bibitem{min10}
1206-乙部厳己,min10フォントについて.
1207-\url{http://argent.shinshu-u.ac.jp/~otobe/tex/files/min10.pdf}
1208-
1209-\bibitem{otf}
1210-齋藤修三郎,Open Type Font用VF.
1211-\url{http://psitau.kitunebi.com/otf.html}
1212-
1213-\bibitem{stack-mail}
1214-Jonathan Sauer, \emph{[Dev-luatex] tex.currentgrouplevel}.
1215-\url{http://www.ntg.nl/pipermail/dev-luatex/2008-August/001765.html}
1216-
1217-\bibitem{uptex}
1218-Takuji Tanaka, \emph{u\pTeX, up\LaTeX---unicode version of \pTeX, p\LaTeX}.
1219-\url{http://homepage3.nifty.com/ttk/comp/tex/uptex_en.html}
1220-
1221-\bibitem{ptexenc}
1222-Nobuyuki Tsuchimura, \emph{Development of a Japanese \TeX\ Distribution~`ptetex3'},
1223-Computer Software\ \textbf{24} (2007), no.~4, 40--50, (in Japanese).
1224-
1225-\bibitem{w3c}
1226-W3C Working Group, \emph{Requirements for Japanese Text Layout}.
1227-\url{http://www.w3.org/TR/jlreq/}
1228-\end{thebibliography}
1229-
1230-\end{document}
1+%#!lualatex ajt-devel-ltja
2+\documentclass{ajt}
3+
4+%%% Packages used in this paper
5+
6+%%% Font setting for \LuaTeX; this is an extract from ajt.cls
7+\makeatletter
8+ \if@print
9+ \RequirePackage{fontspec,xunicode}
10+ \RequirePackage{luatextra}
11+ \setmainfont[Mapping=tex-text]{Palatino LT Std}
12+ \setsansfont[Mapping=tex-text]{Optima LT Std}
13+ \else
14+ \RequirePackage{fontspec,luatextra}
15+ \setmainfont[Mapping=tex-text]{TeX Gyre Pagella} % \simeq Palatino
16+ \fi
17+
18+%%% LuaTeX-ja
19+\usepackage{luatexja,luatexja-fontspec}
20+\ltjsetparameter{jacharrange={-3,-8}}
21+\DeclareFontShape{JY3}{mc}{m}{n}{<-> s*[0.92489] file:ipam.ttf:jfm=ujis}{}
22+\DeclareFontShape{JY3}{gt}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=ujis}{}
23+% quick hack: monospaced Japanese font by \ttfamily
24+\DeclareKanjiFamily{JY3}{\ttdefault}{}{}
25+\DeclareFontShape{JY3}{\ttdefault}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=mono}{}
26+
27+
28+%%% LTXexample environment
29+\usepackage{showexpl,lltjlisting}
30+\lstset{basicstyle=\ttfamily\small, width=0.3\textwidth, basewidth=.5em}
31+
32+%%% Verbatim environment
33+\usepackage{fancyvrb}
34+\CustomVerbatimEnvironment{code}{Verbatim}%
35+{numbers=left,xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small}
36+\CustomVerbatimEnvironment{codewithoutnum}{Verbatim}%
37+{xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small}
38+\CustomVerbatimEnvironment{codewithoutnumsmall}{Verbatim}%
39+{xleftmargin=1.5em,baselinestretch=1.0,fontsize=\footnotesize}
40+\DefineShortVerb{\|}
41+
42+%%% Others
43+\usepackage{mflogo,booktabs}
44+\definecolor{grayx}{gray}{0.85}
45+\hyphenation{
46+ kanjiskip
47+ xkanjiskip
48+}
49+
50+%%% Mandatory article metadata %%%
51+\title{Development of \LuaTeX-ja package}
52+\author{Hironori Kitagawa {\normalsize 北川 弘典}}
53+\address{\LuaTeX-ja project team}
54+\email{h\_kitagawa2001@yahoo.co.jp}
55+
56+\keywords{\TeX, p\TeX, \LuaTeX, \LuaTeX-ja, Japanese}
57+\abstract{%
58+The \LuaTeX-ja package is a macro package for typesetting Japanese
59+documents under \LuaTeX. The package offers more flexibility in
60+typesetting than \pTeX, a widely used Japanese extension of \TeX,
61+and corrects some unwanted features of \pTeX.
62+In this paper, we describe specifications, the current status and some
63+internal processing methods of \LuaTeX-ja.
64+}
65+
66+\newcommand{\parname}[1]{\textsf{#1}}
67+\newcommand{\jstrut}{\vrule width0pt height\cht depth\cdp}
68+\newcommand{\imagfm}[1]{\ifvmode\leavevmode\fi%
69+ \hbox{\fboxsep=0pt\fbox{\setbox0=\hbox{#1}\copy0\kern-\wd0
70+ \smash{\vrule width \wd0 height 0.4pt depth0.4pt}}}}
71+\begin{document}
72+
73+%%% Do not forget to start with \maketitle!
74+\maketitle
75+
76+\section{Introduction}
77+\subsection{History}
78+To typeset Japanese documents with \TeX, ASCII \pTeX~\cite{ptex} has
79+been widely used in Japan. There are other methods---for example, using
80+Omega and OTP~\cite{omega}, or with the CJK package---to do so; however,
81+these alternative methods did not become mainstream. The author thinks
82+that this is because \pTeX\ enables us to produce high-quality documents
83+(e.g.,~supporting vertical typesetting), and because \pTeX\ appeared
84+earlier than the alternatives described above.
85+
86+However, \pTeX\ has been left behind by the extensions of \TeX\ such
87+as \eTeX\ and \pdfTeX, and the diffusion of UTF-8 encoding. In recent
88+years, the situation has become better, by development of
89+|ptexenc|~\cite{ptexenc} by Nobuyuki Tsuchimura (\hbox{土村展之}),
90+$\varepsilon$-\pTeX~\cite{eptex} by the author,~and u\pTeX~\cite{uptex}
91+by Takuji Tanaka (田中琢爾). However, continuing this approach, namely,
92+to develop an engine extension localized for Japanese, is not wise. This
93+approach needs lots of work for \emph{each} engine. In addition, if we
94+use \LuaTeX, the necessity of an engine extension is getting smaller
95+because \LuaTeX\ has an ability to hook \TeX's internal process by using
96+Lua callbacks.
97+
98+
99+There were several experimental attempts to typeset
100+Japanese documents with \LuaTeX\ before. Here we cite three examples:
101+\begin{itemize}
102+\item |luaums.sty|~\cite{luaums} developed by the author. This
103+ experimental package is for creating a certain Japanese-based presentation
104+ with \LuaTeX.
105+\item the \emph{luajalayout} package~\cite{luajalayout}, formerly known as the
106+ \emph{jafontspec} package, by Kazuki Maeda (前田一貴). This package is based on
107+ \LaTeXe\ and \emph{fontspec} package.
108+\item the \emph{luajp-test} package~\cite{luajp-test}, a test package made by
109+ Atsuhito Kohda (香田温人), based on articles on the web page~\cite{joylua}.
110+\end{itemize}
111+However, these packages are based on \LaTeXe, and do not have much
112+ability to control the typesetting rules. It is also inefficient for several
113+people to develop similar packages separately. Because of
114+these circumstances, development of the \LuaTeX-ja package was started,
115+initially by the author and Kazuki Maeda.
116+
117+\subsection{Development policy of \LuaTeX-ja}
118+\label{ssec-pol}
119+The first aim of \LuaTeX-ja project was to implement features (from the
120+`primitive' level) of \pTeX\ as macros under \LuaTeX, therefore \LuaTeX-ja is
121+much affected by \pTeX. However, as development proceeded, some
122+technical/conceptual difficulties arose. Hence we changed the aim
123+of the project as follows:
124+\begin{itemize}
125+\item\emph{\LuaTeX-ja offers at least the same flexibility of
126+ typesetting that p\TeX\ has.}
127+
128+ We are not satisfied with merely being able to produce outputs conforming to
129+ JIS~X~4051~\cite{jisx4051}, the Japanese Industrial Standard for
130+ typesetting, or to a technical note~\cite{w3c} by W3C;
131+ if one wants to produce very incoherent outputs for some reason, it
132+ should be possible.
133+In this respect, previous attempts of Japanese typesetting with \LuaTeX\
134+ which we cited in the previous subsection are inadequate.
135+
136+\pTeX\ has some flexibility of typesetting, by changing internal
137+ parameters such as |\kanjiskip| or |\prebreakpenalty|, and by using
138+ custom JFM (Japanese TFM). Therefore we decided to include this
139+ functionality in \LuaTeX-ja.
140+
141+\item\emph{\LuaTeX-ja is not a mere re-implementation or port of \pTeX;
142+ some (technically and/or conceptually) inconvenient features of
143+ \pTeX\ are modified.}
144+
145+ We describe this point in more detail in the next section.
146+\end{itemize}
147+
148+
149+\subsection{Overview of the processes}
150+\label{ssec-over}
151+We outline the processes of \LuaTeX-ja in order.
152+
153+\begin{itemize}
154+\item In the |process_input_buffer| callback: treatment of breaking
155+ lines after a Japanese character (in Subsection~\ref{ssec-line}).
156+
157+\item In the |hyphenate| callback: font replacement.
158+
159+\LuaTeX-ja examines each \textit{glyph\_node}~$p$ in the horizontal list. If
160+ the character represented by $p$ is considered a Japanese
161+ character, the font used at $p$ is replaced by the value of
162+ |\ltj@curjfnt|, an attribute for `the current Japanese font'
163+ at~$p$.
164+
165+Furthermore, the subtype of $p$ is decreased by 1 to suppress
166+ hyphenation around $p$ by \LuaTeX, because later processes of
167+ \LuaTeX-ja take care of everything concerning Japanese characters.
168+
169+\item In |pre_linebreak_filter| and |hpack_filter| callbacks:
170+
171+\begin{enumerate}
172+\item \LuaTeX-ja has its own stack system, and the current horizontal
173+ list is traversed in this stage to determine what the level of
174+ \LuaTeX-ja's internal stack at the end of the list is. We will
175+ discuss it in Subsection~\ref{ssec-stack}.
176+
177+\item In this stage, \LuaTeX-ja inserts glues/kerns for Japanese
178+ typesetting in the list. This is the core routine of \LuaTeX-ja.
179+ We will discuss it in Subsections
180+ \ref{ssec-jglue}~and~\ref{ssec-jspec}.
181+
182+\item To make a match between a metric and a real font, the position
183+ of (Japanese) glyphs is sometimes adjusted.
184+ We will discuss it in Subsection~\ref{ssec-width}.
185+\end{enumerate}
186+\item In the |mlist_to_hlist| callback: treatment of Japanese characters
187+ in math formulas. This stage is similar to the adjustment of the
188+ position of glyphs (see above), so we omit its description
189+ from this paper.
190+\end{itemize}
191+
192+In this paper, an \emph{alphabetic character} means a non-Japanese
193+character. Similarly, we use the term \emph{alphabetic font} for the
194+counterpart of a Japanese font.
195+
196+\subsection{Contents of this paper}
197+Here we briefly describe the contents of the rest of this paper. In
198+Section~\ref{sec:differences_with_ptex}, we describe major differences
199+between \pTeX\ and \LuaTeX-ja. The next section,
200+Section~\ref{sec:distinction_of_characters}, concentrates on the
201+problem of how we distinguish between Japanese characters and alphabetic
202+characters. In Section~\ref{sec:current_status}, we show the current
203+development status of the package. Finally, in
204+Section~\ref{sec:implementation}, we describe some internal routines of
205+\LuaTeX-ja.
206+
207+\subsection{General information of the project}
208+This \LuaTeX-ja project is hosted by SourceForge.jp. The official wiki
209+is located at
210+\url{http://sourceforge.jp/projects/luatex-ja/wiki/}. As of October 22, 2011,
211+there is no stable version; however, a set of developer sources can be
212+obtained from the git repository. Members of the project team are as follows
213+(in random order): Hironori Kitagawa, Kazuki Maeda, Takayuki Yato,
214+Yusuke Kuroki, Noriyuki Abe, Munehiro Yamamoto, Tomoaki Honda,
215+and~Shuzaburo Saito.
216+
217+
218+\section{Major differences with \pTeX}
219+\label{sec:differences_with_ptex}
220+In this section, we explain several major differences between \pTeX\
221+and our \LuaTeX-ja. For general information on Japanese typesetting and an
222+overview of \pTeX, please see Okumura~\cite{ptexjp}.
223+
224+
225+\subsection{Names of control sequences}
226+\label{ssec-csname} Because \pTeX\ is an engine modification of Knuth's
227+original \TeX82 engine, some of the additional primitives take a form that is
228+very difficult to simulate with a macro. For example, an additional
229+primitive |\prebreakpenalty|$\langle\hbox{\it
230+char\_code}\rangle$|[=]|$\langle\hbox{\it penalty}\rangle$ in \pTeX\
231+sets the amount of penalty inserted before a character whose code is
232+$\langle\hbox{\it char\_code}\rangle$ to $\langle\hbox{\it
233+penalty}\rangle$, and this form |\prebreakpenalty|$\langle\hbox{\it
234+char\_code}\rangle$ can also be used for retrieving the value.
235+
236+Moreover, there are some internal parameters of \pTeX\ whose values at the end of a
237+horizontal box or of a paragraph are valid for the whole box or
238+paragraph. However, the implementation of these parameters in
239+\LuaTeX-ja is not so easy; we will discuss it in Subsection~\ref{ssec-stack}.
240+
241+Because of the two problems discussed above, the assignment and retrieval
242+of most parameters in \LuaTeX-ja are consolidated into the following
243+three control sequences:
244+\begin{itemize}
245+\item |\ltjsetparameter{|$\langle\hbox{\it
246+ name}\rangle$|=|$\langle\hbox{\it value}\rangle$|,...}|: for local
247+ assignment.
248+\item |\ltjglobalsetparameter|: for global assignment. Note that these two control
249+ sequences obey the value of the |\globaldefs| primitive.
250+\item |\ltjgetparameter{|$\langle\hbox{\it
251+ name}\rangle$|}[{|$\langle\hbox{\it optional
252+ argument}\rangle$|}]|: for retrieval. The returned value is always
253+ a string.
254+\end{itemize}
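+
+As a rough illustration of these control sequences, assignment and
+retrieval might look as follows. This is only a sketch: the parameter
+names are the ones mentioned in this paper, and the concrete values are
+arbitrary assumptions chosen for the example.
+\begin{verbatim}
+% local assignment; several name=value pairs may be given at once
+\ltjsetparameter{kanjiskip=0pt plus 0.4pt minus 0.4pt,
+                 yalbaselineshift=1pt}
+% global assignment (illustrative value)
+\ltjglobalsetparameter{xkanjiskip=\maxdimen}
+% retrieval; the result is a string
+The current shift is \ltjgetparameter{yalbaselineshift}.
+\end{verbatim}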
255+
256+\subsection{Line-break after a Japanese character}
257+\label{ssec-line}
258+
259+Lines in Japanese text can be broken almost anywhere, whereas lines in
260+alphabetic text can be broken only between words (or by
261+hyphenation). Hence, \pTeX's input processor is modified so that a
262+line-break after a Japanese character doesn't emit a space. However,
263+there is no way to customize the input processor of \LuaTeX, other than
264+to hack its CWEB-source. All a macro package can do is to modify an input line
265+before \LuaTeX\ begins to process it, inside the |process_input_buffer|
266+callback.
267+
268+Hence, in \LuaTeX-ja, a comment letter (we reserve U+FFFFF for this
269+purpose) will be appended to an input line, if this line ends with a Japanese
270+character.\footnote{Strictly speaking, it also requires that the catcode
271+of the end-line character is 5~(\emph{end-of-line}). This condition is
272+useful under the verbatim environment.} One might jump to the conclusion
273+that the treatment of a line-break by \pTeX\ and that of \LuaTeX-ja are
274+exactly the same; however, they differ in that \LuaTeX-ja's
275+decision whether a comment letter will be appended to the line is made
276+\emph{before} the line is actually processed by \LuaTeX.
277+
278+Figure~\ref{fig-linebreak} shows an example of this situation; the
279+|\ltjsetparameter| command in the example marks most Japanese characters as
280+`non-Japanese characters'. In other words, from that command onward, the
281+letter `あ' will be treated as an alphabetic character by
282+\LuaTeX-ja. Then, it would be natural to have a space between `あ' and `y' in
283+the output, but the actual output in the figure has none. This is
284+because `あ' is still considered a Japanese character by \LuaTeX-ja
285+when \LuaTeX-ja decides whether U+FFFFF will be added to
286+input line~2.
287+
288+\begin{figure}
289+\begin{LTXexample}
290+\font\x=IPAMincho \x
291+\ltjsetparameter{jacharrange={-6}}xあ
292+y
293+\end{LTXexample}
294+\caption{A notable sample showing the treatment of a line-break after a
295+Japanese character.}\label{fig-linebreak}
296+\end{figure}
297+
298+\subsection{Separation between `real' fonts and metrics}
299+\label{ssec-sepmet}
300+
301+Traditionally, most Japanese fonts used in typesetting are not
302+proportional, that is, most glyphs have the same size (in most cases,
303+square-shaped). Hence, it is not rare that the contents of different
304+JFMs are essentially the same, and only differ in their names. For example,
305+|min10.tfm| and |goth10.tfm|, which are JFMs shipped with \pTeX\ for the
306+seriffed \emph{mincho} family and the sans-seriffed \emph{gothic} family,
307+differ only in their |FAMILY| and |FACE|. Moreover, |jis.tfm| and
308+|jisg.tfm|, which are included in the \emph{jis} font metric
309+used in \emph{jsclasses}~\cite{jsclasses} by Haruhiko Okumura (奥村晴彦),
310+are completely identical as binary files. Considering this situation, we
311+decided to separate `real' fonts and metrics used for them in
312+\LuaTeX-ja. Typical declarations of Japanese fonts in the style of plain
313+\TeX\ are shown in Figure~\ref{fig-jfdef}. We would like to add several
314+remarks:
315+\begin{itemize}
316+\item A control sequence |\jfont| must be used for Japanese fonts, instead of |\font|.
317+\item \LuaTeX-ja automatically loads the \emph{luaotfload} package, so
318+ \hbox{\tt file:} and \hbox{\tt name:} prefixes, and various font features can be
319+ used, as in the first line of Figure~\ref{fig-jfdef}.
320+\item The |jfm| key specifies the metric for the font. In
321+ Figure~\ref{fig-jfdef}, both fonts will use a metric stored in a
322+ Lua script named |jfm-ujis.lua|. This metric is the standard
323+ metric in \LuaTeX-ja, and is based on JFMs used in the \emph{otf}
324+ package~\cite{otf}.
325+\item The \hbox{\tt psft:} prefix can be used to specify name-only, non-embedded
326+ fonts. When one displays a PDF with these fonts, the actual fonts
327+ used for them depend on the PDF reader.
328+\end{itemize}
329+The specification of a metric for \LuaTeX-ja is similar to that of a JFM
330+(see \cite{ptexjp}); characters are grouped into several classes, the
331+size information of characters is specified for each class, and
332+glue/kern insertions are specified for each pair of classes. Although
333+the author has not tried it, it may be possible to develop a program that
334+`converts' a JFM to a metric for \LuaTeX-ja. \LuaTeX-ja offers three
335+metrics by default: |jfm-ujis.lua|, |jfm-jis.lua| based on the
336+\emph{jis} font metric, and |jfm-min.lua| based on the old |min10.tfm|.
337+
338+ Note that |-kern| in features
339+is important, because kerning information from a real font itself will
340+clash with glue/kern information from the metric.
341+
342+\begin{figure}
343+\begin{verbatim}
344+\jfont\foo=file:ipam.ttf:jfm=ujis;script=latn;-kern;+jp04 at 12pt
345+\jfont\bar=psft:Ryumin-Light:jfm=ujis at 10pt
346+\end{verbatim}
347+\caption{Typical declarations of Japanese fonts.}
348+\label{fig-jfdef}
349+\end{figure}
350+
351+\subsection{Insertion of glues/kerns for Japanese typesetting: timing}
352+\label{ssec-jglue}
353+
354+As described in \cite{luatexref}, \LuaTeX's kerning and ligaturing
355+processes are totally different from those of \TeX82. \TeX82's process is
356+done just when a (sequence of) character is appended to the current
357+list. Thus we can interrupt this process by writing
358+|f{}irm|. However, \LuaTeX's process is \emph{node-based}, that is, the
359+process will be done when a horizontal box or a paragraph is ended, so
360+|f{}irm| and |firm| yield the same output under \LuaTeX.
361+
362+The situation for Japanese characters is more complicated.
363+Glues (and kerns) which are needed for Japanese
364+typesetting are divided into the following three categories:
365+\begin{itemize}
366+\item Glue (or kern) from the metric of Japanese fonts (\emph{JFM glue},
367+ for short).
368+
369+\item Default glue between a Japanese character and an alphabetic
370+ character (\emph{xkanjiskip}, for short), usually 1/4 of
371+ full-width (\emph{shibuaki}) with some stretch and shrink for
372+ justifying each line.
373+\item Default glue between two consecutive Japanese characters
374+ (\emph{kanjiskip}, for short). The main reason for this glue is to
375+ enable breaking lines almost everywhere in Japanese texts. In most
376+ cases, its natural width is zero, with some stretch/shrink for
377+ justifying each line.
378+\end{itemize}
379+In \pTeX, these three kinds of glues are treated differently. A JFM glue
380+is inserted when a (sequence of) Japanese character is appended to the
381+current list, just as in the case of alphabetic characters in \TeX82. This
382+means that one can interrupt the insertion process by saying |{}|. An
383+\emph{xkanjiskip} is inserted just before `hpack' or line-breaking of a
384+paragraph; this timing is somewhat similar to that of \LuaTeX's kerning
385+process. Finally, a \emph{kanjiskip} does not appear as a node anywhere;
386+it appears only implicitly in the calculation of the width of a horizontal box,
387+in line-breaking, and in the actual output process to a DVI
388+file. These specifications have made \pTeX's behavior very hard to
389+understand.
390+
391+\LuaTeX-ja inserts glues in all three categories simultaneously inside
392+|hpack_filter| and |pre_linebreak_filter| callbacks. The reasons for
393+this design are to make Japanese characters behave like alphabetic characters in \LuaTeX\
394+(as described in the first paragraph of this subsection), and to clarify the specification
395+of \LuaTeX-ja's process.
396+
397+\subsection{Insertion of glues/kerns for Japanese typesetting: specification}
398+\label{ssec-jspec}
399+
400+\begin{table}
401+\caption{Examples of differences between \pTeX\ and \LuaTeX-ja.}
402+\label{tab-jfmglue}
403+\begin{center}
404+\begin{tabular}{llllllll}
405+\toprule
406+&\multicolumn{1}{c}{(1)}&\multicolumn{1}{c}{(2)}&\multicolumn{1}{c}{(3)}&\multicolumn{1}{c}{(4)}\\
407+Input &|あ】{}【〙\/〘| &|い』\/a| &|う)\hbox{}(| &|え]\special{}[|\\\midrule
408+\pTeX &あ】\hbox{}【〙\hbox{}〘&い』\/a &う)\hbox{}( &え]\hbox{}[\\
409+\LuaTeX-ja &あ】{}【〙\/〘 &い』\/a &う)\hbox{}( &え]\special{}[\\
410+\bottomrule
411+\end{tabular}
412+\end{center}
413+\end{table}
414+
415+\begin{figure}
416+\begin{center}
417+\fontsize{40}{40}\selectfont
418+\imagfm{\jstrut あ}%
419+\imagfm{\jstrut 】\inhibitglue}%
420+\imagfm{\jstrut\kern.5\zw}%
421+\imagfm{\jstrut\kern.5\zw}%
422+\imagfm{\jstrut\inhibitglue【}%
423+\imagfm{\jstrut 〙\inhibitglue}%
424+\imagfm{\jstrut\kern.5\zw}%
425+\imagfm{\jstrut\kern.5\zw}%
426+\imagfm{\jstrut\inhibitglue〘}%
427+\end{center}
428+\caption{Detail of (1) in Table~\ref{tab-jfmglue}.}
429+% Would this be better? \caption{Details of the output of \pTeX in the case of (1) in Table~\ref{tab-jfmglue}.}
430+\label{fig-ptexjfm}
431+\end{figure}
432+
433+Now we will take a look at the insertion process itself through four points.
434+
435+\begin{description}
436+\item[Ignored Nodes]
437+As noted in the previous subsection, the insertion process in \pTeX\ can
438+ be interrupted by saying |{}| or anything else.\footnote{This
439+ is why some tricks like \texttt{ちょ\char`\{\char`\}っと} for
440+ \texttt{min10.tfm} and other `old' JFMs work.} This leads to the
441+ second row in Table~\ref{tab-jfmglue}, or
442+ Figure~\ref{fig-ptexjfm}. Here `the process is interrupted'
443+ means that \pTeX\ does not think the letter `】\inhibitglue'
444+ is followed by `\inhibitglue【', hence two half-width glues
445+ are inserted between `】\inhibitglue' and `\inhibitglue【',
446+ where the left one is from `】\inhibitglue' and the right one
447+ is from `\inhibitglue【'.
448+
449+ On the other hand, in \LuaTeX-ja, the process is done inside
450+ |hpack_filter| and |pre_linebreak_filter| callbacks. Hence,
451+ \emph{anything that does not make any node will be
452+ ignored}\ in \LuaTeX-ja, as shown in (1) in
453+ Table~\ref{tab-jfmglue}. \LuaTeX-ja also ignores any node
454+ which does not make any contribution to the current horizontal
455+ list (\emph{ins\_node}, \emph{adjust\_node},
456+ \emph{mark\_node}, \emph{whatsit\_node} and
457+ \emph{penalty\_node}), as shown in (4).
458+
459+
460+By the way, around a \emph{glyph\_node} $p$ there may be some nodes
461+ attached to~$p$. These are an accent and kerns for
462+ moving it to the right place, and a kern from the italic
463+ correction\footnote{\TeX82 (and \LuaTeX) does not distinguish
464+ between explicit kern and a kern for italic correction. To
465+ distinguish them, an additional subtype for a kern is introduced
466+ in \pTeX. On the other hand, \LuaTeX-ja uses an additional attribute and
467+ redefines \texttt{\char`\\/} to set this attribute.} for $p$. It is natural that
468+ these attachments should be ignored inside the process. Hence
469+ \LuaTeX-ja takes this approach, as does the latest version of
470+ \pTeX\ (version~p3.2). This explains (2) in Table~\ref{tab-jfmglue}.
471+ % - Is p3.2 a pTeX version? It may be better to expand it to something like (version~p3.2).
472+ % - `in the figure' leaves the reader unsure what is being pointed at; even if tedious, `in Table~\ref{tab-jfmglue}' would be more appropriate.
473+
474+To summarize, one should put an empty horizontal box |\hbox{}|
475+ where one wants to interrupt the insertion process in
476+ \LuaTeX-ja, as in (3) in Table~\ref{tab-jfmglue}.
477+ % - `in the figure' leaves the reader unsure what is being pointed at; even if tedious, `in Table~\ref{tab-jfmglue}' would be more appropriate.
478+
479+\item[Fonts with the Same Metric]
480+Recall that \LuaTeX-ja separates `real' fonts and metrics, as in Subsection~\ref{ssec-sepmet}.
481+Consider the following input, where all Japanese fonts use the same metric
482+ (in \LuaTeX-ja), and |\gt| selects \emph{gothic} family for
483+ the current Japanese font family:
484+\begin{quote}
485+\begin{verbatim}
486+明朝)\gt (ゴシック
487+\end{verbatim}
488+\end{quote}
489+If the above input is processed by \pTeX, because the insertion process is
490+ interrupted by |\gt|, the result looks like
491+\begin{quote}
492+\mc 明朝)\hbox{}\gt (ゴシック
493+\end{quote}
494+However this seems to be unnatural, since two Japanese fonts in the
495+ output use the same metric, i.e.,~the same
496+ typesetting rule. Hence, we decided that Japanese fonts with
497+ the same metric are treated as one font in the insertion
498+ process of \LuaTeX-ja. Thus, the output from the above input
499+ in \LuaTeX-ja looks like:
500+\begin{quote}
501+\mc 明朝)\gt (ゴシック
502+\end{quote}
503+There may be situations where this default behavior is not
504+ suitable. \LuaTeX-ja offers a way to handle such situations, but
505+ we leave it to the manual~\cite{man}.
506+
507+\item[Fonts with Different Metrics]
508+The case where two consecutive Japanese characters use different metrics and/or
509+ different sizes is similar. Consider the following input where
510+ the \emph{mincho} family and the \emph{gothic} family use
511+ different metrics:
512+\begin{quote}
513+\begin{verbatim}
514+漢)\gt (漢)\large (大
515+\end{verbatim}
516+\end{quote}
517+As in the previous paragraph, \pTeX\ yields the following from this input:
518+\begin{quote}
519+\mc 漢)\hbox{}\gt (漢)\hbox{}\large (大
520+\end{quote}
521+We thought that the amount of space between parentheses in the above output
522+ is too large. Hence we have changed the default behavior of
523+ \LuaTeX-ja, so that the amount of a glue between two Japanese
524+ characters with different metrics is the \emph{average} of a glue
525+ from the left character and that from the right
526+ character. For example, Figure~\ref{fig-diffmet} shows the
527+ output from the above input. The width of the glue marked `(1)' is
528+ $(a/2 + a/2)/2 = 0.5a$, and the width of the glue marked `(2)'
529+ is $(a/2 + 1.2a/2)/2 = 0.55a$. This default behavior can be
530+ changed by the \textsf{diffrentmet} parameter of \LuaTeX-ja.
531+
532+\begin{figure}
533+\begin{center}
534+\fontsize{40}{40}\selectfont
535+\imagfm{\jstrut\smash{%
536+ \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr漢\cr
537+ \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$a$}\
538+ \hrulefill\vrule height .5ex depth .5ex\cr}}}}%
539+\imagfm{\jstrut )\inhibitglue}%
540+\hbox to .5\zw{\hss\normalsize (1)\hss}%
541+\imagfm{\jstrut\inhibitglue\gt (}%
542+\imagfm{\jstrut\gt 漢}%
543+\imagfm{\jstrut\gt )\inhibitglue}%
544+\hbox to .55\zw{\hss\normalsize (2)\hss}%
545+\imagfm{\fontsize{48}{48}\selectfont\jstrut\gt\inhibitglue (}%
546+\imagfm{\fontsize{48}{48}\selectfont\jstrut\smash{%
547+ \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr\gt 大\cr
548+ \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$1.2a$}\
549+ \hrulefill\vrule height .5ex depth .5ex\cr}}}}
550+\end{center}
551+\caption{Fonts with different metrics.}
552+\label{fig-diffmet}
553+\end{figure}
554+
555+\item[\emph{kanjiskip} and \emph{xkanjiskip}]
556+In \pTeX, the value of \emph{xkanjiskip} is controlled by a skip named
557+ |\xkanjiskip|. A well-known defect of this implementation is
558+ that the value of \emph{xkanjiskip} is not connected with the
559+ size of the current Japanese font. It seems that |EXTRASPACE|,
560+ |EXTRASTRETCH|, |EXTRASHRINK| parameters in a JFM are
561+ reserved for specifying the default value of
562+ \emph{xkanjiskip} in a unit of the design size, but \pTeX\
563+ does not actually use these parameters.
564+
565+Considering this situation of \pTeX, \LuaTeX-ja can use the value of
566+ \emph{xkanjiskip} that is specified in a metric. If the value of
567+ \emph{xkanjiskip} on the user side (this is the value of the
568+ \textsf{xkanjiskip} parameter of |\ltjsetparameter|) is
569+ |\maxdimen|, then \LuaTeX-ja uses the specification from
570+ the metric currently in use as the actual value of
571+ \emph{xkanjiskip}. This description also applies to \emph{kanjiskip}.
572+\end{description}
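+
+The averaging rule for fonts with different metrics can be written out
+explicitly. The symbols below are introduced only for this illustration:
+$g_{\mathrm{L}}$ and $g_{\mathrm{R}}$ denote the glue requested by the
+left and the right character, $a$ is the full width of the left font, and
+$s$ is the ratio of the size of the right font to that of the left one.
+Assuming that each parenthesis requests half of its own full width, we get
+\[
+  w = \frac{g_{\mathrm{L}} + g_{\mathrm{R}}}{2}
+    = \frac{1}{2}\left(\frac{a}{2} + \frac{sa}{2}\right)
+    = \frac{(1+s)\,a}{4},
+\]
+which gives $w = 0.5a$ for $s = 1$ and $w = 0.55a$ for $s = 1.2$, in
+agreement with the widths marked `(1)' and `(2)' in Figure~\ref{fig-diffmet}.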
573+
574+\section{Distinction of characters}
575+\label{sec:distinction_of_characters} Since \LuaTeX\ can handle Unicode
576+characters natively, how we distinguish Japanese characters from
577+alphabetic characters is a major problem. For example, the
578+multiplication sign (U+00D7) exists both in ISO-8859-1 (hence in Latin-1
579+Supplement in Unicode) and in the basic Japanese character set
580+JIS~X~0208. It is not desirable that this character is always treated as
581+an alphabetic character, because this symbol is often used in the sense
582+of `negative' in Japan.
583+
584+\subsection{Character ranges}
585+Before we describe the approach taken in \LuaTeX-ja, we review the
586+approach taken by u\pTeX. u\pTeX\ extends the |\kcatcode| primitive in
587+\pTeX, to use this primitive for setting how a character is treated
588+among alphabetic characters~(15), \emph{kanji}~(16), \emph{kana}~(17),
589+\emph{other CJK characters}~(18), or~\emph{Hangul}~(19).
590+The assignment to |\kcatcode| can be done by a Unicode
591+block.\footnote{There are some exceptions. For example, U+FF00--FFEF
592+(Halfwidth and Fullwidth Forms) is divided into three blocks in recent
593+u\pTeX.}
594+
595+\LuaTeX-ja adopted a different approach. There are many Unicode blocks
596+ in the Basic Multilingual Plane which are not included in
597+ Japanese fonts; therefore it is inconvenient to process characters by Unicode
598+ block. Furthermore, JIS~X~0208 is not just a union of Unicode
599+ blocks; for example, the intersection of JIS~X~0208 and
600+ Latin-1 Supplement is shown in
601+ Table~\ref{tab-inter}. Considering these two points, to
602+ customize the range of Japanese characters in \LuaTeX-ja, one
603+ has to define ranges of character codes in the source in advance.
604+
605+
606+\begin{table}
607+\caption{Intersection of JIS~X~0208 and Latin-1 Supplement.}
608+\label{tab-inter}
609+\begin{center}
610+\begin{tabular}{llll}
611+\ltjjachar"A7 (U+00A7),&
612+\ltjjachar"A8 (U+00A8),&
613+\ltjjachar"B0 (U+00B0),&
614+\ltjjachar"B1 (U+00B1),\\
615+\ltjjachar"B4 (U+00B4),&
616+\ltjjachar"B6 (U+00B6),&
617+\ltjjachar"D7 (U+00D7),&
618+\ltjjachar"F7 (U+00F7)
619+\end{tabular}
620+\end{center}
621+\end{table}
622+
623+%%Example...
624+
625+We note that \LuaTeX-ja offers two additional control sequences,
626+ |\ltjjachar| and |\ltjalchar|. They are similar to the |\char|
627+ primitive; however, |\ltjjachar| always yields a Japanese character, provided that
628+ the argument is greater than or equal to 128, and |\ltjalchar| always
629+ yields an alphabetic character, regardless of the argument.
630+
631+\subsection{Default setting of ranges}
632+Patches for plain \TeX\ and \LaTeXe\ of \LuaTeX-ja predefine 8~character
633+ranges, as shown in Table~\ref{tab-chrrng}. Almost all of these ranges are
634+just unions of Unicode blocks, and are determined from the Adobe-Japan1-6
635+character collection~\cite{aj16}, and JIS~X~0208. Among these 8~ranges,
636+the ranges~2, 3, 6, 7, and~8 are considered ranges of Japanese
637+characters, and others are considered ranges of alphabetic
638+characters.\footnote{Note that ranges 3~and~8 are considered ranges of
639+alphabetic characters in this paper.} We remark on ranges 2~and~8:
640+\begin{description}
641+\item[The range~2]
642+JIS~X~0208 includes Greek letters and Cyrillic letters, however, these
643+ letters cannot be used for typesetting Greek or Russian, of
644+ course. Hence it is reasonable that Greek letters and
645+ Cyrillic letters form a separate character range.
646+\item[The range~8]
647+If one wants to use 8-bit TFMs, such as T1 or TS1 encodings, one should
648+ mark this range~8 as a range of alphabetic characters by
649+\begin{quote}
650+|\ltjsetparameter{jacharrange={-8}}|
651+\end{quote}
652+This is because some 8-bit TFMs have a glyph in this range; for example,
653+ the character `\OE' is located at |"D7| in the T1 encoding. %"
654+\end{description}
655+
656+
657+\begin{table}
658+\caption{Predefined ranges in \LuaTeX-ja.}
659+\label{tab-chrrng}
660+\begin{center}
661+\begin{tabular}{@{\bf}rl}
662+1&(Additional) Latin characters which do not belong to the range~8.\\
663+2&Greek and Cyrillic letters.\\
664+3&Punctuation and miscellaneous symbols.\\
665+4&Unicode blocks which do not intersect with Adobe-Japan1-6.\\
666+5&Surrogates and Supplementary Private Use Areas.\\
667+6&Characters used in Japanese typesetting.\\
668+7&Characters possibly used in CJK typesetting, but not in Japanese.\\
669+8&Characters in Table~\ref{tab-inter}.
670+\end{tabular}
671+\end{center}
672+\end{table}
673+
674+\subsection{Control sequences producing Unicode characters}
675+\label{ssec-unichar}
676+
677+The \emph{fontspec} package\footnote{Precisely speaking, it is the
678+\emph{xunicode} package, originally a package for \XeTeX\ and
679+automatically loaded by the \emph{fontspec} package.} offers various
680+control sequences that produce Unicode characters. However, these
681+control sequences, as they stand, do not work correctly with the default
682+range setting of \LuaTeX-ja. For example, |\textquotedblleft| is just
683+an abbreviation of |\char"201C\relax|, and the character U+201C (LEFT %"
684+DOUBLE QUOTATION MARK) is treated as a Japanese character, because it
685+belongs to the range~3. This problem is resolved by using |\ltjalchar|
686+instead of the |\char| primitive. This fix is included in an optional package
687+named \texttt{luatexja-\penalty0fontspec.sty}. Figure~\ref{fig-unitxt}
688+shows several ways to typeset a character, both as a Japanese character
689+and as an alphabetic character.
690+
691+\begin{figure}
692+\begin{LTXexample}
693+×, \char`×, % depend on range setting
694+\ltjalchar`×, % alphabetic char
695+\ltjjachar`×, % Japanese char
696+\texttimes % alph. char (by fontspec)
697+\end{LTXexample}
698+\caption{Control sequences producing a Unicode character.}
699+\label{fig-unitxt}
700+\end{figure}
701+
702+The situation looks similar in math formulas, but in fact it differs.
703+Each control sequence that represents an ordinary symbol defined by the
704+\emph{unicode-math} package is just a synonym of a character. For example,
705+the meaning of |\otimes| is just the character U+2297 (CIRCLED TIMES),
706+which is included in the range~3. However, it is difficult to define a
707+control sequence like |\ltjalUmathchar| as a counterpart of
708+|\Umathchar|, since an input like `|\sum^\ltjalUmathchar ...|' has to be
709+permitted.
710+
711+However, we could not develop a satisfactory solution to this problem in
712+time for this paper. We are currently testing the
713+solution below:
714+\begin{itemize}
715+\item \LuaTeX-ja has a list of character codes which will always be treated as
716+ alphabetic characters in math mode. Considering 8-bit TFMs for
717+ math symbols, this list includes natural numbers between |"80| and
718+ |"FF| by default.
719+\item Redefine internal commands defined in the \emph{unicode-math}
720+ package so that
721+codes of characters which are mentioned in the \emph{unicode-math}
722+ package will be included in the list.
723+\end{itemize}
724+
725+
726+We would like to extend treatments described in this subsection to 8-bit
727+font encodings, but we leave it to further development too.
728+
729+\section{Current status of development}
730+\label{sec:current_status}
731+At the moment, \LuaTeX-ja can be used under plain \TeX, and under
732+\LaTeXe. Generally speaking, one only has to load |luatexja.sty|, by the
733+|\input| command or |\usepackage| (in~\LaTeXe), if one merely wants to
734+typeset Japanese characters. We look at each part in more detail.
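+
+A minimal sketch of both ways of loading the package is given below. The
+sample text is an arbitrary assumption, and the font setup is left to
+whatever defaults the package provides.
+\begin{verbatim}
+% plain TeX
+\input luatexja.sty
+日本語の文章。
+\bye
+
+% LaTeX2e
+\documentclass{article}
+\usepackage{luatexja}
+\begin{document}
+日本語の文章。
+\end{document}
+\end{verbatim}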
735+
736+\subsection{`Engine extension'}
737+The lowest part of \LuaTeX-ja corresponds to the \pTeX\ extension as
738+\emph{an engine extension of \TeX}. We, the project members, think that
739+this part is almost done. There is one more feature of \LuaTeX-ja which
740+we are going to explain:
741+
742+\begin{description}
743+\item[Shifting Baseline]
744+In order to make a match between Japanese fonts and alphabetic fonts,
745+ sometimes shifting the baseline of alphabetic characters may
746+ be needed. \pTeX\ has a dimension |\ybaselineshift|, which
747+ corresponds to the amount of shifting down the baseline of alphabetic
748+ characters. This is useful for Japanese-based documents, but
749+ not for documents mainly in languages with alphabetic
750+ characters.
751+
752+Hence, \LuaTeX-ja extends \pTeX's |\ybaselineshift| to Japanese
753+ characters. Namely, \LuaTeX-ja offers two parameters,
754+ \textsf{yjabaselineshift} and \textsf{yalbaselineshift}, for the
755+ amount of shifting the baseline of Japanese characters and
756+ that of alphabetic characters, respectively.
757+\begin{figure}
758+\begin{center}
759+\fontsize{40}{40}\selectfont\fboxsep0mm
760+\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth
761+\hbox to 0.9\linewidth{%
762+\hfil
763+\raise-10pt\imagfm{\jstrut 漢}%
764+\raise-10pt\imagfm{\jstrut 字}\hskip.25\zw%
765+\imagfm{p}%
766+\imagfm{h}%
767+\hfil\hfil
768+\imagfm{\jstrut 漢}%
769+\imagfm{\jstrut 字}\hskip.25\zw%
770+\raise-10pt\imagfm{p}%
771+\raise-10pt\imagfm{h}%
772+\hfil
773+}
774+\end{center}
775+
776+\caption{First example of shifting baseline.}
777+\label{fig-bls}
778+\end{figure}
779+
780+\begin{figure}
781+\begin{center}
782+\fontsize{30}{30}\selectfont\fboxsep0mm
783+\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth
784+\hbox to 0.9\linewidth{%
785+\hfil
786+\imagfm{a}%
787+\imagfm{b}\hskip.25\zw%
788+\imagfm{\jstrut 本}%
789+\imagfm{\jstrut 文}\hskip.33333\zw%
790+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut\inhibitglue (}%
791+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 注}%
792+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 釈}\hskip.1666667\zw%
793+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont c}%
794+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont o}%
795+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}%
796+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}%
797+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont e}%
798+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont n}%
799+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont t}%
800+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut )\inhibitglue}%
801+\hskip.33333\zw%
802+\imagfm{\jstrut 本}%
803+\imagfm{\jstrut 文}%
804+\hfil
805+}
806+\end{center}
807+
808+\caption{Second example of shifting baseline.}
809+\label{fig-small}
810+\end{figure}
811+
812+An example output is shown in Figure~\ref{fig-bls}. The left half is the
813+ output when \textsf{yjabaselineshift} is positive, hence the
814+ baseline of Japanese characters is shifted down. On the other
815+ hand, the right half is the output when
816+ \textsf{yalbaselineshift} is positive, hence the baseline of
817+ alphabetic characters is shifted down. Figure~\ref{fig-small}
818+ shows an interesting use of these parameters.
819+
820+\end{description}
821+Note that \LuaTeX-ja doesn't support vertical typesetting, \emph{tategaki}, for now.
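+
+As a minimal sketch of how the two halves of Figure~\ref{fig-bls} could be
+requested, assume that the parameters are set through |\ltjsetparameter| and
+that the amount is 10\,pt, as in the |\raise-10pt| of the figure source (the
+grouping and the sample text are assumptions made for illustration):
+\begin{quote}
+\begin{verbatim}
+% left half: the baseline of Japanese characters is shifted down
+{\ltjsetparameter{yjabaselineshift=10pt}漢字ph\par}
+% right half: the baseline of alphabetic characters is shifted down
+{\ltjsetparameter{yalbaselineshift=10pt}漢字ph\par}
+\end{verbatim}
+\end{quote}
+Since a positive value shifts the baseline down, a negative value would raise
+the corresponding characters, which is the direction used for the smaller
+text in Figure~\ref{fig-small}.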
822+
823+\subsection{Patches for plain \TeX\ and \LaTeXe}
824+\pTeX\ has a patch for plain \TeX, namely |ptex.tex|, a patch for the \LaTeXe\
825+macros (this patch together with \LaTeXe\ constitutes \emph{p\LaTeXe}), and
826+|kinsoku.tex|, which contains the default settings of \emph{kinsoku
827+shori}, the Japanese line-breaking rules. We ported them to \LuaTeX-ja, except
828+the code related to vertical typesetting, because \LuaTeX-ja doesn't
829+support vertical typesetting yet. We remark on one point related to the
830+porting:
831+\begin{description}
832+
833+\item[Behavior of\/ {\tt\char92fontfamily\/}]
834+The control sequence |\fontfamily| in p\LaTeXe\ changes the current alphabetic
835+ font family and/or the current Japanese font family,
836+ depending on the argument. More concretely,
837+ |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the
838+ current alphabetic font family to $\langle\hbox{\it
839+ arg\/}\rangle$, if and only if one of the following
840+ conditions is satisfied:
841+\begin{itemize}
842+\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$ in
843+ \emph{some} alphabetic encoding is already defined in the document.
844+\item There exists an alphabetic encoding $\langle\hbox{\it
845+ enc\/}\rangle$ already defined in the document such that a font
846+ definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it
847+ arg\/}\rangle$|.fd| (all lowercase) exists.
848+\end{itemize}
849+The same criterion is used for changing the Japanese font family.
850+
851+For this behavior to work well, a list of all (alphabetic) encodings already
852+ defined in the document is needed. However, since \LuaTeX-ja
853+ is loaded as a package, \LuaTeX-ja cannot obtain this list.
854+ Hence \LuaTeX-ja adopted a different approach, namely
855+ |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the
856+ current alphabetic font family to $\langle\hbox{\it
857+ arg\/}\rangle$, if and only if one of the following conditions is satisfied:
858+\begin{itemize}
859+\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$
860+ in the current alphabetic encoding $\langle\hbox{\it
861+ enc\/}\rangle$ is already defined in the document.
862+\item A font definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it
863+ arg\/}\rangle$|.fd| (all lowercase) exists.
864+\end{itemize}
865+
866+
867+\end{description}
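+
+As an illustration of the \LuaTeX-ja rule, the following sketch assumes that
+the current alphabetic encoding is T1, that |t1ptm.fd| is installed, that no
+alphabetic family named |gt| is available, and that the Japanese family |gt|
+has been declared; the comments describe the expected effect under these
+assumptions only.
+\begin{quote}
+\begin{verbatim}
+\fontfamily{ptm}\selectfont % alphabetic family becomes ptm (t1ptm.fd exists)
+\fontfamily{gt}\selectfont  % Japanese family becomes gt; the alphabetic
+                            % family stays ptm
+\end{verbatim}
+\end{quote}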
868+
869+
870+
871+\subsection{Classes for Japanese documents}
872+To produce `high-quality' Japanese documents, it is not enough that
873+Japanese characters are placed correctly; we also need class files for
874+Japanese documents. Two major families of classes are widely used in Japan:
875+\emph{jclasses}, which is distributed with the official p\LaTeXe\ macros,
876+and \emph{jsclasses}. At present, \LuaTeX-ja
877+simply contains their counterparts: \emph{ltjclasses} and
878+\emph{ltjsclasses}. However, the policy on classes has not been determined
879+yet, and we hope to have another family of classes which are useful for
880+commercial printing. In the author's opinion, \emph{ltjclasses} should
881+remain as an example of porting class files for \pTeX\ to
882+\LuaTeX-ja.
883+
884+\subsection{Patches for packages}
885+Apart from patches for the \LaTeXe~kernel and classes for Japanese
886+documents, we need to make patches for several packages. At present,
887+we have considered the following packages, and made patches or ports for
888+the first two of them.
889+
890+\begin{description}
891+\item[The \emph{fontspec} package] The \emph{fontspec} package is built
892+ on NFSS2, hence control sequences offered by the
893+ \emph{fontspec} package, such as |\setmainfont|, are only
894+ effective for alphabetic fonts if \LuaTeX-ja is loaded.
895+ \texttt{luatexja-\penalty0fontspec.sty} (not automatically
896+ loaded) offers their counterparts for Japanese fonts, with
897+ an additional `j' in the name of control sequences, such as
898+ |\setmainjfont|. As described in
899+ Subsection~\ref{ssec-unichar}, it also includes a patch for
900+ control sequences producing Unicode characters.
901+
902+\item[The \emph{otf} package]
903+This package is widely used in \pTeX\ for typesetting characters which are
904+not in JIS~X~0208, and for using more than one weight in \emph{mincho}
905+and \emph{gothic} font families. Therefore \LuaTeX-ja supports features
906+of the \emph{otf} package when \texttt{luatexja-\penalty0otf.sty}
907+ is loaded manually. Note that characters by |\UTF{xxxx}| and
908+ |\CID{xxxx}| are not appended to the current list as a
909+ \emph{glyph\_node}, to avoid callbacks by the
910+ \emph{luaotfload} package. One more remark: |\CID|
911+ does not work with TrueType fonts, since |\CID| uses the
912+ conversion table between CID and the glyph order of the
913+ current Japanese font.
914+
915+\item[The \emph{listings} package]
916+It is well known among users of \pTeX\ that there is a patch |jlisting.sty| for
917+ the \emph{listings} package, to use Japanese characters in
918+ the |lstlisting| environment. Generally speaking, it can also
919+ be used in \LuaTeX-ja. However, it seems that a
920+ Japanese character after a space does not receive any processing
921+ by the \emph{listings} package; this is inconvenient when we
922+ use the \emph{showexpl} package.
923+
924+There is another way to use characters above 256 with the
925+ \emph{listings} package (described in~\cite{apl}). However,
926+ this method is not suitable for Japanese, since the number of
927+ Japanese characters is very large. We hope that the
928+ \emph{listings} package will be able to handle all characters above
929+ 256 without any patch, in the future.
930+
931+
932+\end{description}
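+
+A minimal preamble sketch for the two patched packages follows. The font
+name |IPAMincho| and the code point |9DD7| are placeholders chosen for
+illustration; only |\setmainjfont| and |\UTF| are taken from the
+description above.
+\begin{quote}
+\begin{verbatim}
+\usepackage{luatexja,luatexja-fontspec} % the latter is not loaded automatically
+\usepackage{luatexja-otf}               % must be loaded manually, too
+\setmainjfont{IPAMincho}                % Japanese counterpart of \setmainfont
+...
+森\UTF{9DD7}外 % a character specified by its Unicode code point
+\end{verbatim}
+\end{quote}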
933+
934+
935+
936+\section{Implementation}
937+\label{sec:implementation}
938+\subsection{Handling of Japanese fonts}
939+In \pTeX, there are three slots for maintaining current fonts, namely
940+|\font| for alphabetic fonts, |\jfont| for Japanese fonts (in horizontal
941+direction) and |\tfont| for Japanese fonts (in vertical direction). With
942+these slots, we can manage the current font for alphabetic characters
943+and that for Japanese characters separately in \pTeX. However, \LuaTeX\
944+has only one slot for maintaining the current font, as in \TeX82. This
945+situation leads to a problem: how can we maintain the `current Japanese
946+font'?
947+
948+There are three approaches to this problem. One approach is to make a
949+mapping table from alphabetic fonts to corresponding Japanese fonts
950+(here we don't assume that NFSS2 is available). Another approach is
951+that we always use composite fonts with alphabetic fonts and Japanese
952+fonts. The third approach is that the information of the current
953+Japanese font is stored in an attribute. We adopted the third approach,
954+since \LuaTeX-ja is strongly influenced by \pTeX\ as we noted in
955+Subsection~\ref{ssec-pol}.
956+
957+As in Figure~\ref{fig-jfdef}, \LuaTeX-ja uses |\jfont| for defining
958+Japanese fonts, as in \pTeX. However, because the information of the current
959+Japanese font is stored in an attribute, control sequences defined by
960+|\jfont| (e.g.,~|\foo| and |\bar| in Figure~\ref{fig-jfdef}) do
961+not represent a font in the sense of \TeX82. In other words, each of
962+these control sequences is just an assignment to an attribute, therefore
963+they cannot be an argument of |\the|, |\fontname|, or |\textfont|.
964+
965+
966+Callbacks by the \emph{luaotfload} package, e.g.,~replacement of glyphs
967+according to font features, are executed just after `Examination of
968+Stack Level' (see Subsections \ref{ssec-over}~and~\ref{ssec-stack}). Note that calculation of
969+character classes for each Japanese character is done \emph{after}
970+these callbacks for now.
971+
972+\subsection{Stack management}
973+\label{ssec-stack}
974+
975+As we noted in Subsection~\ref{ssec-csname}, parameters whose values
976+at the end of a horizontal box or of a paragraph are valid in the
977+whole box or paragraph, such as \emph{kanjiskip}, cannot be implemented
978+by internal integers or registers of other types in \TeX. We explain why
979+in this subsection.
980+
981+\begin{figure}
982+\begin{lstlisting}
983+void package(int c)
984+{
985+ ...
986+ d = box_max_depth;
987+ unsave();
988+ save_ptr -= 4;
989+ if (cur_list.mode_field == -hmode) {
990+ cur_box = filtered_hpack(cur_list.head_field,
991+ cur_list.tail_field, saved_value(1),
992+ saved_level(1), grp, saved_level(2));
993+ subtype(cur_box) = HLIST_SUBTYPE_HBOX;
994+ } else {
995+\end{lstlisting}
996+\caption{An extract of a CWEB-source \texttt{tex/packaging.w} of \LuaTeX.}
997+\label{fig-ltsrc}
998+\end{figure}
999+
1000+Figure~\ref{fig-ltsrc} is an extract of a CWEB-source
1001+\texttt{tex/packaging.w} of \LuaTeX\ (SVN revision 4358). This function
1002+is called just when an explicit |\hbox{...}| or |\vbox{...}| is ended, and
1003+the function |filtered_hpack()| is where the |hpack_filter| and then the
1004+actual `hpack' process are performed. Notice that the |unsave()|
1005+function is called before |filtered_hpack()|. This is the problem;
1006+because of |unsave()|, we can retrieve only the values of registers
1007+\emph{outside} the box, even in the |hpack_filter| callback.
1008+
1009+To cope with this problem, \LuaTeX-ja has its own stack system, based on
1010+Lua codes in \cite{stack-mail}. Furthermore, \emph{whatsit} nodes whose
1011+\emph{user\_id} is 30112 (\emph{stack\_node}, for short) will be
1012+appended to the current horizontal list each time the current stack
1013+level is incremented, and their values are the values of
1014+|\currentgrouplevel| at that time. In the beginning of the |hpack_filter|
1015+callback, the list in question is traversed to determine whether the
1016+stack level at the end of the list and that outside the box coincide.
1017+
1018+Let $x$ be the value of |\currentgrouplevel|, and $y$ be the current
1019+stack level, both inside the |hpack_filter| callback, i.e.,~outside a
1020+horizontal box. Consider the list which represents the content of the box;
1021+then we have:
1022+\begin{itemize}
1023+\item A \emph{stack\_node} whose value is $x+1$ (because all materials in
1024+ the box are included in a group |\hbox{...}|, the value is at
1025+ least $x+1$) in the list represents an assignment related to the
1026+ stack system at the top level of the list, like
1027+\begin{quote}
1028+\begin{verbatim}
1029+\hbox{...(assignment)...}
1030+\end{verbatim}
1031+\end{quote}
1032+In this case, the current stack level is incremented to $y+1$ after the assignment.
1033+\item A \emph{stack\_node} whose value is more than $x+1$ in the list represents
1034+an assignment inside another group contained in the box. For example,
1035+ the following input creates
1036+a \emph{stack\_node} whose value is $x+3=(x+1)+2$:
1037+\begin{quote}
1038+\begin{verbatim}
1039+\hbox{...{...{...(assignment)}...}...}
1040+\end{verbatim}
1041+\end{quote}
1042+\end{itemize}
1043+Thus, we can conclude that the stack level at the end of the list is
1044+$y+1$, if and only if there is a \emph{stack\_node} whose value is
1045+$x+1$. Otherwise, the stack level is just $y$.
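+
+For a concrete pair of inputs, assume that the assignment in question is
+|\ltjsetparameter{kanjiskip=3pt}| (the value 3\,pt is arbitrary and chosen
+only for illustration):
+\begin{quote}
+\begin{verbatim}
+% stack_node with value x+1: the stack level at the end is y+1,
+% so the new kanjiskip is valid in the whole box
+\hbox{あ\ltjsetparameter{kanjiskip=3pt}い}
+% stack_node with value x+2: the stack level at the end is y,
+% so the box is typeset with the kanjiskip in force outside it
+\hbox{あ{\ltjsetparameter{kanjiskip=3pt}い}う}
+\end{verbatim}
+\end{quote}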
1046+
1047+\subsection{Adjustment of the position of Japanese characters}
1048+\label{ssec-width}
1049+
1050+The size of a glyph specified in a metric and that of a real font
1051+usually differ. For example, the letter `\inhibitglue【' is half-width
1052+in |jfm-ujis.lua| or |jis.tfm|, while this letter is full-width like `【'
1053+in most TrueType fonts used in Japanese typesetting, such as
1054+IPA~Mincho. Hence the adjustment of position of such glyphs is
1055+needed. In the context of \pTeX, this process was performed using virtual fonts.
1056+
1057+On the other hand, \LuaTeX-ja does the adjustment by encapsulating a glyph
1058+into a horizontal box. There are two main reasons why we adopted this
1059+method; one is that we feared that the Lua code for coexisting with callbacks by
1060+the |luaotfload| package would be large if we used virtual fonts, and the
1061+other is to cope with shifting of the baseline of characters at the
1062+same time.
1063+
1064+\begin{figure}
1065+\begin{center}\unitlength=9pt\small
1066+\begin{picture}(15,12)(-1,-3)
1067+
1068+\color{grayx}% real glyph
1069+\put(-1,-1.5){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength}
1070+
1071+\color{black}% real glyph :step1
1072+\thicklines
1073+\put(-1,-1.5){\line(0,1){7}\line(0,-1){2.5}}
1074+\put(5,-1.5){\line(0,1){7}\line(0,-1){2.5}}
1075+\put(-1,5.5){\line(1,0){6}}
1076+\put(-1,-4){\line(1,0){6}}
1077+\put(-1,0){\makebox(0,0)[r]{\strut$R$\,}}
1078+
1079+\thicklines
1080+\put(0,0){\vector(0,1){9}\line(0,-1){3}\vector(1,0){12}}
1081+\put(12,9){\makebox(0,0)[rt]{\strut$M$\,}}
1082+\put(12,0){\line(0,1){9}\vector(0,-1){3}}
1083+\put(0,9){\line(1,0){12}}
1084+\put(0,-3){\line(1,0){12}}
1085+\put(0.2,4.5){\makebox(0,0)[l]{\texttt{height}}}
1086+\put(12.2,-1.5){\makebox(0,0)[l]{\texttt{depth}}}
1087+\put(6,0.2){\makebox(0,0)[b]{\texttt{width}}}
1088+
1089+\thicklines
1090+\put(3,0){\line(0,1){7}\line(0,-1){2.5}\line(1,0){6}}
1091+\put(9,0){\line(0,1){7}\line(0,-1){2.5}}
1092+\put(3,7){\line(1,0){6}}
1093+\put(3,-2.5){\line(1,0){6}}
1094+\newsavebox{\eqdist}
1095+\savebox{\eqdist}(0,0)[c]{%
1096+ \thinlines
1097+ \put(-0.08,0.2){\line(0,-1){0.4}}%
1098+ \put(0.08,0.2){\line(0,-1){0.4}}}
1099+\put(1.5,0){\usebox{\eqdist}}
1100+\put(10.5,0){\usebox{\eqdist}}
1101+
1102+\thicklines
1103+\put(3,-1.5){\vector(-1,0){4}}
1104+\put(1,-1.7){\makebox(0,0)[t]{\texttt{left}}}
1105+\put(3,0){\vector(0,-1){1.5}}
1106+\put(3.2,-0.75){\makebox(0,0)[l]{\texttt{down}}}
1107+\end{picture}
1108+\end{center}
1109+\caption{The position of the `real' glyph.}
1110+\label{fig-pos}
1111+\end{figure}
1112+
1113+Figure~\ref{fig-pos} shows the adjustment process. A large square $M$ is
1114+the imaginary body specified in the metric, and a vertical
1115+rectangle is the imaginary body of a real glyph. First, the real glyph
1116+is aligned with respect to the width of $M$. In the figure, the real
1117+glyph is aligned `middle'; this setting is useful for the full-width
1118+middle dot `・'. We have other settings, `left' and `right'.
1119+After that, it is shifted according to the values of |left| and |down|,
1120+which are specified in the metric, too. The final position of the real glyph
1121+is shown by the gray rectangle~$R$. If the amount of shifting the baseline is
1122+not zero, $M$ (and hence the real glyph) is shifted by that amount.
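+
+To make the horizontal step concrete, write $w_M$ and $w_R$ for the widths of
+$M$ and of the real glyph (these symbols are introduced here only for
+illustration). Under `middle' alignment, the left edge of $R$ lies at
+\[
+ x_R = \frac{w_M - w_R}{2} - \texttt{left}
+\]
+measured from the left edge of $M$. In Figure~\ref{fig-pos}, in units of
+|\unitlength|, we have $w_M=12$, $w_R=6$ and $\texttt{left}=4$, so
+$x_R = 3-4 = -1$; likewise, the baseline of $R$ lies $\texttt{down}=1.5$
+below that of $M$.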
1123+
1124+We would like to remark briefly on the vertical position of a real
1125+glyph. A JFM (or a metric used in \LuaTeX-ja) and a real font used for
1126+it may have different height or depth. In that case, it may look better
1127+if the real glyph is shifted vertically to match the height-depth ratio
1128+specified in the metric. However, no vertical adjustment other than the
1129+adjustment by the |down| value is performed in the present
1130+implementation of \LuaTeX-ja. This situation is carefully studied by
1131+Otobe~\cite{min10}. The policy on this problem has not been determined
1132+yet; however, we would like to offer several solutions in future
1133+development.
1134+
1135+\section{Conclusion}
1136+We have discussed our \LuaTeX-ja package, which is strongly influenced
1137+by \pTeX. For now, it can be used for experimental purposes; however, many
1138+refinements are needed before regular use. The author hopes
1139+that this paper and the \LuaTeX-ja project contribute to the typesetting of Japanese,
1140+and possibly other Asian languages, under \LuaTeX.
1141+
1142+\section*{Acknowledgements}
1143+The author would like to thank Ken Nakano and Hideaki Togashi for their
1144+development of ASCII \pTeX. The author is very grateful to Haruhiko
1145+Okumura for his leadership in the Japanese \TeX\ community. The author
1146+is also very grateful to members of the \LuaTeX-ja project team for their
1147+valuable cooperation in development.
1148+
1149+%%% The style of the bibliography is `amsplain'.
1150+\providecommand{\bysame}{\leavevmode\hbox to3em{\hrulefill}\thinspace}
1151+\providecommand{\href}[2]{#2}
1152+\begin{thebibliography}{99}
1153+
1154+\bibitem{aj16}
1155+Adobe Systems Incorporated, \emph{Adobe-Japan1-6 Character Collection
1156+ for CID-Keyed Fonts}, Technical Note~\#5078, 2004.
1157+\url{http://partners.adobe.com/public/developer/en/font/5078.Adobe-Japan1-6.pdf}
1158+
1159+\bibitem{ptex}
1160+ASCII MEDIA WORKS,アスキー日本語\TeX\ (\pTeX). \url{http://ascii.asciimw.jp/pb/ptex/}
1161+
1162+\bibitem{apl}
1163+John Baker, \emph{Typesetting UTF8 APL code with the \LaTeX\ lstlisting package}.
1164+\url{http://bakerjd99.wordpress.com/2011/08/15/}
1165+
1166+\bibitem{omega}
1167+Jin-Hwan~Cho and Haruhiko Okumura, \emph{Typesetting CJK Languages with Omega},
1168+\TeX, XML, and Digital Typography, Lecture Notes in Computer Science, vol.~3130,
1169+Springer, 2004, 139--148.
1170+
1171+\bibitem{joylua}
1172+Yannis Haralambous. \emph{The Joy of \LuaTeX}. \url{http://luatex.bluwiki.com/}
1173+
1174+\bibitem{jisx4051}
1175+Japanese Industrial Standards Committee. \emph{JIS~X~4051: Formatting
1176+ rules for Japanese documents}, 1993, 1995, 2004.
1177+
1178+\bibitem{eptex}
1179+北川弘典,$\varepsilon$-\pTeX についてのwiki.
1180+\url{http://sourceforge.jp/projects/eptex/wiki/FrontPage}
1181+
1182+\bibitem{luaums}
1183+北川弘典,\LuaTeX で日本語.
1184+\url{http://oku.edu.mie-u.ac.jp/tex/mod/forum/discuss.php?d=378}
1185+
1186+\bibitem{luatexref}
1187+\LuaTeX\ development team, \emph{The \LuaTeX\ reference}.
1188+\url{http://www.luatex.org/svn/trunk/manual/luatexref-t.pdf} (snapshot of SVN trunk)
1189+
1190+\bibitem{man}
1191+\LuaTeX-ja project team, \emph{The \LuaTeX-ja package}.
1192+Not completed for now. Available at |doc/man-en.pdf| (in English) or
1193+ |doc/man-ja.pdf| (in Japanese)
1194+in the Git repository.
1195+
1196+\bibitem{luajp-test}
1197+香田温人,\LuaTeX と日本語.
1198+\url{http://www1.pm.tokushima-u.ac.jp/~kohda/tex/luatex-old.html}
1199+
1200+\bibitem{luajalayout}
1201+前田一貴,luajalayout パッケージ---Lua\LaTeX によ
1202+ る日本語組版---.
1203+\url{http://www-is.amp.i.kyoto-u.ac.jp/lab/kmaeda/lualatex/luajalayout/}
1204+
1205+\bibitem{jsclasses}
1206+奥村晴彦,p\LaTeXe 新ドキュメントクラス.
1207+\url{http://oku.edu.mie-u.ac.jp/~okumura/jsclasses/}
1208+
1209+\bibitem{ptexjp}
1210+Haruhiko Okumura, \emph{\pTeX\ and Japanese Typesetting},
1211+ The Asian Journal of \TeX\ \textbf{2}~(2008), 43--51.
1212+
1213+\bibitem{min10}
1214+乙部厳己,min10フォントについて.
1215+\url{http://argent.shinshu-u.ac.jp/~otobe/tex/files/min10.pdf}
1216+
1217+\bibitem{otf}
1218+齋藤修三郎,Open Type Font用VF.
1219+\url{http://psitau.kitunebi.com/otf.html}
1220+
1221+\bibitem{stack-mail}
1222+Jonathan Sauer, \emph{[Dev-luatex] tex.currentgrouplevel}.
1223+\url{http://www.ntg.nl/pipermail/dev-luatex/2008-August/001765.html}
1224+
1225+\bibitem{uptex}
1226+Takuji Tanaka, \emph{u\pTeX, up\LaTeX---unicode version of \pTeX, p\LaTeX}.
1227+\url{http://homepage3.nifty.com/ttk/comp/tex/uptex_en.html}
1228+
1229+\bibitem{ptexenc}
1230+Nobuyuki Tsuchimura, \emph{Development of a Japanese \TeX\ Distribution~`ptetex3'},
1231+Computer Software\ \textbf{24} (2007), no.~4, 40--50, (in Japanese).
1232+
1233+\bibitem{w3c}
1234+W3C Working Group, \emph{Requirements for Japanese Text Layout}.
1235+\url{http://www.w3.org/TR/jlreq/}
1236+\end{thebibliography}
1237+
1238+\end{document}