I absolutely love Esperanto and love writing in and translating into the language. One issue that one often encounters is that a lot of fonts do not support the Esperanto characters - the famous ĉapelliteroj - such as Ŝ, Ĉ or Ĥ. When I translated Mario Kart 8 Deluxe into Esperanto I had to create these letters manually. That wasn’t too hard, especially since the game used raster images for its letters, but often we are working with vector images. Yes, it is possible to add the missing letters by hand, but it is very hard, and some fonts are just very challenging to copy because of their unique style!
But this problem isn’t limited to Esperanto. Most European languages use special letters: German has its famous umlauts such as Ü, Slovenian has Č, French has Ê and Portuguese has Ã. And let’s not even talk about languages which use completely different scripts, like Thai, Bulgarian or Mandarin. It’s not limited to the translation community either! Imagine you find a font online that you really like and that you would love to use in your projects, but alas! The font only includes English letters…
All of this leads to the question: Is it possible to automatically generate new letters in the style of the font, without having to modify anything manually? This is the question I sought to answer in my latest project: Tiparilo.
Introduction#
The plan#
The plan is to use a generative neural network trained on fonts which cover a large part of the Unicode character set. The user could then load a .ttf or .otf file and select which missing characters to infer. The generative model would then be trained on the new style, infer the vector shapes of the missing characters and add them to the .ttf or .otf file.
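The "select which missing characters" step boils down to a set difference between the characters we want and the characters the font already maps. Here is a minimal sketch of that idea, not Tiparilo's actual code: `find_missing` and `toy_cmap` are hypothetical names, and the toy dictionary stands in for the mapping that fontTools returns from `TTFont(path).getBestCmap()` (code point to glyph name).

```python
def find_missing(cmap, wanted):
    """Return the characters in `wanted` that have no entry in `cmap`.

    `cmap` maps Unicode code points (ints) to glyph names, the same shape
    that fontTools' TTFont.getBestCmap() returns.
    """
    return [ch for ch in wanted if ord(ch) not in cmap]


# Toy mapping standing in for a font that only covers plain C, H and S.
toy_cmap = {ord("C"): "C", ord("H"): "H", ord("S"): "S"}

# "CĈHĤSŜ", written with escapes so the code points are unambiguous.
wanted = "C\u0108H\u0124S\u015c"

missing = find_missing(toy_cmap, wanted)
print(missing)  # ['Ĉ', 'Ĥ', 'Ŝ']
```

With a real font, the list returned here would be the set of glyphs the model has to generate.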
Raster vs Vector Graphics#
First and foremost we have to differentiate between raster and vector graphics.
Raster images have a fixed resolution and each pixel has a fixed colour value, so they appear pixelated when their size is increased.
Vector images, on the other hand, have no fixed resolution: they are stored as geometric functions, so they can be resized as much as wanted and will still appear crisp.
Old hardware often relied on bitmaps - a very simple form of a raster image - for letters, but most modern computers use vector images for letters, because one can increase the size of each letter without the letters becoming pixelated. This is important, because a lot of generative networks work with raster images, but I want to generate new vector images based on existing ones.
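To make the difference concrete, here is a small sketch with made-up data (the function names are my own, not from any library). Upscaling a bitmap can only repeat the pixels that are already there, while scaling an outline just multiplies its coordinates, so no detail is lost.

```python
def upscale_bitmap(bitmap, factor):
    """Nearest-neighbour upscaling: every pixel becomes a factor x factor block."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in bitmap
        for _ in range(factor)
    ]


def scale_outline(points, factor):
    """Scaling a vector outline only multiplies its coordinates."""
    return [(x * factor, y * factor) for x, y in points]


# A 2x2 bitmap containing a diagonal, and the same diagonal as a line segment.
bitmap = [[1, 0],
          [0, 1]]
segment = [(0, 0), (2, 2)]

big_bitmap = upscale_bitmap(bitmap, 2)
big_segment = scale_outline(segment, 2)

for row in big_bitmap:
    print(row)       # the diagonal turns into two 2x2 blocks
print(big_segment)   # [(0, 0), (4, 4)], still an exact straight line
```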
Most common letter classifications#
Sans-serif vs serif: grotesque, neo-grotesque, geometric, humanist and mixed
TTF vs OTF#
The most common font formats are TTF (TrueType) and OTF (OpenType). TrueType is a font standard which was developed by Apple in the late 1980s, but is now used across many operating systems. TrueType fonts store their characters as line segments and quadratic Bézier curves, where $p_0$ and $p_2$ are the end points, $p_1$ is the control point and $t$ runs from 0 to 1:
$$ p(t)= (1-t)^2p_0 + 2t(1-t)p_1 + t^2 p_2 $$
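The formula translates directly into code. The following is a minimal sketch that evaluates the curve at a few values of $t$; the control points are made up for illustration and do not come from any real font.

```python
def quadratic_bezier(p0, p1, p2, t):
    """Evaluate p(t) = (1-t)^2 p0 + 2t(1-t) p1 + t^2 p2 for 2D points."""
    a = (1 - t) ** 2
    b = 2 * t * (1 - t)
    c = t ** 2
    return (a * p0[0] + b * p1[0] + c * p2[0],
            a * p0[1] + b * p1[1] + c * p2[1])


# Hypothetical control points: start, off-curve control point, end.
p0, p1, p2 = (0.0, 0.0), (1.0, 2.0), (2.0, 0.0)

# Sample the curve at t = 0, 0.25, 0.5, 0.75 and 1.
samples = [quadratic_bezier(p0, p1, p2, i / 4) for i in range(5)]
print(samples)
# [(0.0, 0.0), (0.5, 0.75), (1.0, 1.0), (1.5, 0.75), (2.0, 0.0)]
```

Note that the curve starts at $p_0$ and ends at $p_2$, but only bends towards $p_1$ without passing through it.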
Complex glyphs are then just a sequence of Bézier curves and straight lines joined end to end.
TTFs save a lot of information, from the relative position and size of each glyph in a coordinate system, through metadata, to kerning and hinting.
Kerning is pretty special - the distance between letters is actually not always the same. We have to adapt the distance for certain pairs of letters so that we perceive an equal spacing.
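As a rough illustration of how kerning changes a layout, here is a sketch that places glyphs one after another and adds a per-pair adjustment between them. The advance widths and kerning values are invented, and real fonts store this information in dedicated tables rather than in a plain dictionary.

```python
def layout(text, advances, kerning):
    """Return the x position of each glyph in `text`.

    `advances` maps a character to its advance width and `kerning` maps a
    pair of characters to an adjustment that is added between them.
    """
    positions = []
    x = 0
    for i, ch in enumerate(text):
        if i > 0:
            # Pairs without an entry get no adjustment.
            x += kerning.get((text[i - 1], ch), 0)
        positions.append(x)
        x += advances[ch]
    return positions


# Made-up values in font units.
advances = {"A": 600, "V": 600}
kerning = {("A", "V"): -80, ("V", "A"): -80}

print(layout("AVA", advances, {}))       # [0, 600, 1200]
print(layout("AVA", advances, kerning))  # [0, 520, 1040]
```

The negative values pull A and V closer together, because their slanted sides would otherwise leave a visible gap.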
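Finally there is also font hinting, which uses clever tricks to adapt the letters to low-resolution displays, for example by nudging the outlines so that they line up with the pixel grid. Making the left edges red and the right edges blue is a related but separate trick called subpixel rendering, which is applied by the text renderer rather than stored in the font.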
OTFs are the other major font format used in
UNICODE#
Generative Models#
=> GANs => cGANs => VAEs => Diffusion
Existing attempts and challenges#
https://github.com/mansgreback/ai-typography
The famous letter Ĥ presents a whole series of new challenges: there are differing conventions about where certain accents should be placed.
Implementation#
Fonttools#
One of the first things to do is to see if we can access and extract letters from the font files. We need to install fontTools with pip (the package is called fonttools) and also svgwrite to write the output to SVG files. I’m going to use the Game Boy Boot font by Akihiro, which I also use for my website.
from fontTools.ttLib import TTFont
from fontTools.pens.svgPathPen import SVGPathPen
import svgwrite

path = 'Gbboot.ttf'
letter_a = 'a'
svg_path = 'a.svg'


def a_to_svg(path, letter, output_path):
    """Draw one glyph of the font at `path` into an SVG file."""
    font = TTFont(path)
    glyphSet = font.getGlyphSet()

    # glyphSet is indexed by glyph name; for basic Latin letters the
    # glyph name is usually the letter itself
    pen = SVGPathPen(glyphSet)
    glyph = glyphSet[letter]
    glyph.draw(pen)

    # SVG: the path is written in raw font coordinates
    dwg = svgwrite.Drawing(output_path, size=('500px', '500px'))
    path_data = pen.getCommands()
    dwg.add(dwg.path(d=path_data, fill='black'))
    dwg.save()

    font.close()


a_to_svg(path, letter_a, svg_path)
Executing this code gives us… 9?
What if we want to extract the capital letter A?
Aha, the letters are upside down! This is because svg and ttf use different y-axis conventions: in a font the y-axis points up, while in an SVG it points down. The choice of coordinate system is not important for this undertaking, though. What matters is that we can print the letters and check if the tool that is being built works as it should!
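If the flipped output ever becomes annoying, the fix is a single reflection of the y-axis. This is a minimal sketch with hypothetical helper names and a made-up shape: `flip_y` mirrors a list of points inside a box of a given height, and `svg_flip_transform` builds the equivalent value for the `transform` attribute of an SVG element.

```python
def flip_y(points, height):
    """Map font coordinates (y pointing up) to SVG coordinates (y pointing down)."""
    return [(x, height - y) for x, y in points]


def svg_flip_transform(height):
    """The same flip as an SVG transform value: x' = x, y' = height - y."""
    return f"matrix(1 0 0 -1 0 {height})"


# Hypothetical triangle in font units, inside a 1000-unit-tall box.
triangle = [(0, 0), (500, 1000), (1000, 0)]

flipped = flip_y(triangle, 1000)
print(flipped)                   # [(0, 1000), (500, 0), (1000, 1000)]
print(svg_flip_transform(1000))  # matrix(1 0 0 -1 0 1000)
```

The height of 1000 is only an assumption here; for a real font one would use a value taken from the font itself, such as its ascender or its units per em.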
Setting up the cGAN#
Sources#
- Noun Project Team: Raster vs. Vector. Understanding File Formats for Design
- Apple: TrueType-Reference-Manual