Architecture description
parent 533d4d57c6
commit a16726ae66

ARCHITECTURE.md (new file, 171 lines)
@ -0,0 +1,171 @@
# Typst Compiler Architecture
Wondering how to contribute or just curious how Typst works? This document
covers the general architecture of Typst's compiler, so you get an understanding
of what's where and how everything fits together.

The source-to-PDF compilation process of a Typst file proceeds in four phases.

1. **Parsing:** Turns a source string into a syntax tree.
2. **Evaluation:** Turns a syntax tree and its dependencies into content.
3. **Layout:** Lays out content into frames.
4. **Export:** Turns frames into an output format like PDF or a raster graphic.

The Typst compiler is _incremental:_ Recompiling a document that was compiled
previously is much faster than compiling from scratch. Most of the hard work is
done by [`comemo`], an incremental compilation framework we have written for
Typst. However, the compiler is still carefully written with incrementality in
mind. Below we discuss the four phases and how incrementality affects each of
them.


## Parsing
The syntax tree and parser are located in `src/syntax`. Parsing is a pure
function `&str -> SyntaxNode` without any further dependencies. The result is a
concrete syntax tree reflecting the whole file structure, including whitespace
and comments. Parsing cannot fail. If there are syntactic errors, the returned
syntax tree contains error nodes instead. It's important that the parser deals
well with broken code because it is also used for syntax highlighting and IDE
functionality.
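
To make the "parsing cannot fail" idea concrete, here is a minimal sketch of an infallible parser for a toy grammar (text, whitespace, and parenthesized groups). It is not Typst's parser: `Kind`, `Node`, `parse`, and `parse_items` are hypothetical names chosen for illustration. The point it shows is that the parser returns a tree instead of a `Result`, records problems as `Error` nodes, and keeps every character of the input so that the source can be reassembled from the tree.

```rust
/// Node kinds of a toy grammar (hypothetical, not Typst's `SyntaxKind`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind { Markup, Text, Space, Group, LeftParen, RightParen, Error }

#[derive(Debug)]
struct Node {
    kind: Kind,
    text: String, // only leaves carry text
    children: Vec<Node>,
}

impl Node {
    fn leaf(kind: Kind, text: impl Into<String>) -> Self {
        Node { kind, text: text.into(), children: Vec::new() }
    }

    fn inner(kind: Kind, children: Vec<Node>) -> Self {
        Node { kind, text: String::new(), children }
    }

    /// Reassembles the original source: the tree is lossless.
    fn to_source(&self) -> String {
        if self.children.is_empty() {
            self.text.clone()
        } else {
            self.children.iter().map(Node::to_source).collect()
        }
    }

    fn has_errors(&self) -> bool {
        self.kind == Kind::Error || self.children.iter().any(Node::has_errors)
    }
}

/// A pure function from source text to a tree. Note the return type:
/// there is no `Result`, errors live inside the tree.
fn parse(src: &str) -> Node {
    let chars: Vec<char> = src.chars().collect();
    let mut pos = 0;
    Node::inner(Kind::Markup, parse_items(&chars, &mut pos, false))
}

fn parse_items(chars: &[char], pos: &mut usize, in_group: bool) -> Vec<Node> {
    let mut items = Vec::new();
    while *pos < chars.len() {
        match chars[*pos] {
            '(' => {
                *pos += 1;
                let mut children = vec![Node::leaf(Kind::LeftParen, "(")];
                children.extend(parse_items(chars, pos, true));
                if *pos < chars.len() {
                    // The nested call only stops early at a `)`.
                    *pos += 1;
                    children.push(Node::leaf(Kind::RightParen, ")"));
                } else {
                    // Unclosed group: record a zero-width error node.
                    children.push(Node::leaf(Kind::Error, ""));
                }
                items.push(Node::inner(Kind::Group, children));
            }
            ')' if in_group => break,
            ')' => {
                // Stray closing parenthesis: keep it, but mark it as an error.
                *pos += 1;
                items.push(Node::leaf(Kind::Error, ")"));
            }
            c => {
                let is_space = c.is_whitespace();
                let start = *pos;
                while *pos < chars.len()
                    && !matches!(chars[*pos], '(' | ')')
                    && chars[*pos].is_whitespace() == is_space
                {
                    *pos += 1;
                }
                let text: String = chars[start..*pos].iter().collect();
                let kind = if is_space { Kind::Space } else { Kind::Text };
                items.push(Node::leaf(kind, text));
            }
        }
    }
    items
}
```

With this shape, a highlighter or IDE feature can still walk the tree of a half-typed document such as `a (b`, since the broken part is confined to one error node.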

**Typedness:**
The syntax tree is untyped: any node can have any `SyntaxKind`. This makes it
very easy to (a) attach spans to each node (see below) and (b) traverse the tree
when doing highlighting or IDE analyses (no extra complications like a visitor
pattern). The `typst::syntax::ast` module provides a typed API on top of
the raw tree. This API resembles a more classical AST and is used by the
interpreter.

**Spans:**
After parsing, the syntax tree is numbered with _span numbers._ These numbers
are unique identifiers for syntax nodes that are used to trace back errors in
later compilation phases to a piece of syntax. The span numbers are ordered so
that the node corresponding to a number can be found quickly.
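
One way such an ordering can make lookups fast is sketched below. This is an illustration under assumptions, not Typst's actual numbering scheme: the toy `Node` type, the pre-order numbering, the `upper` bound stored per node, and the `stride` gap between numbers are all hypothetical. Because numbers increase in pre-order, each subtree covers a contiguous range, so a lookup can skip whole subtrees with a binary search at every level.

```rust
/// A toy syntax node (hypothetical, not Typst's `SyntaxNode`).
#[derive(Debug)]
struct Node {
    text: &'static str,
    span: u64,  // number of this node
    upper: u64, // exclusive upper bound of all numbers in this subtree
    children: Vec<Node>,
}

fn leaf(text: &'static str) -> Node {
    Node { text, span: 0, upper: 0, children: Vec::new() }
}

fn inner(children: Vec<Node>) -> Node {
    Node { text: "", span: 0, upper: 0, children }
}

/// Numbers the tree in pre-order. The stride leaves gaps between numbers;
/// the assumption here is that gaps make local renumbering after an edit
/// possible without touching distant nodes.
fn number(node: &mut Node, next: &mut u64, stride: u64) {
    node.span = *next;
    *next += stride;
    for child in &mut node.children {
        number(child, next, stride);
    }
    node.upper = *next;
}

/// Finds the node with the given number by descending only into the one
/// child whose range can contain it.
fn find(node: &Node, span: u64) -> Option<&Node> {
    if span == node.span {
        return Some(node);
    }
    if span < node.span || span >= node.upper {
        return None;
    }
    // Children are sorted by range, so binary search for the first child
    // whose range does not end before `span`.
    let idx = node.children.partition_point(|child| child.upper <= span);
    node.children.get(idx).and_then(|child| find(child, span))
}
```

A diagnostic that only carries a number can then be mapped back to its piece of syntax without scanning the whole tree.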

**Incremental:**
Typst has an incremental parser that can reparse a segment of markup or a
code/content block. After incremental parsing, span numbers are reassigned
locally. This way, span numbers further away from an edit stay mostly stable.
This is important because they are used pervasively throughout the compiler,
including as input to memoized functions. The less they change, the better for
incremental compilation.


## Evaluation
The evaluation phase lives in `src/eval`. It takes a parsed `Source` file and
evaluates it to a `Module`. A module consists of the `Content` that was written
in it and a `Scope` with the bindings that were defined within it.

A source file may depend on other files (imported sources, images, data files),
which need to be resolved. Since Typst is deployed in different environments
(CLI, web app, etc.) these system dependencies are resolved through a general
interface called a `World`. Apart from files, the world also provides
configuration and fonts.

**Interpreter:**
Typst implements a tree-walking interpreter. To evaluate a piece of source, you
first create a `Vm` with a scope stack. Then, the AST is recursively evaluated
through trait impls of the form `fn eval(&self, vm: &mut Vm) -> Result<Value>`.
An interesting detail is how closures are dealt with: When the interpreter sees
a closure / function definition, it walks the body of the closure and finds all
accesses to variables that aren't defined within the closure. It then clones the
values of all these variables (it _captures_ them) and stores them alongside the
closure's syntactical definition in a closure value. When the closure is called,
a fresh `Vm` is created and its scope stack is initialized with the captured
variables.
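
The capture mechanism described above can be sketched with a tiny tree-walking interpreter. Everything here is a simplified stand-in: the `Expr` and `Value` enums, the single-parameter closures, the `free_vars` helper, and the string errors are assumptions made for illustration and do not mirror Typst's real `eval` module.

```rust
use std::collections::HashMap;
use std::rc::Rc;

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Closure { param: String, body: Rc<Expr> },
    Call(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    /// The closure's syntax plus the cloned values of its free variables.
    Closure { param: String, body: Rc<Expr>, captured: HashMap<String, Value> },
}

struct Vm {
    scopes: Vec<HashMap<String, Value>>,
}

impl Vm {
    fn new() -> Self {
        Vm { scopes: vec![HashMap::new()] }
    }

    fn define(&mut self, name: &str, value: Value) {
        self.scopes.last_mut().unwrap().insert(name.to_string(), value);
    }

    fn get(&self, name: &str) -> Result<Value, String> {
        self.scopes
            .iter()
            .rev()
            .find_map(|scope| scope.get(name))
            .cloned()
            .ok_or_else(|| format!("unknown variable: {name}"))
    }
}

/// Collects variables that are accessed in `expr` but not bound inside it.
fn free_vars(expr: &Expr, bound: &mut Vec<String>, out: &mut Vec<String>) {
    match expr {
        Expr::Num(_) => {}
        Expr::Var(name) => {
            if !bound.contains(name) && !out.contains(name) {
                out.push(name.clone());
            }
        }
        Expr::Add(a, b) | Expr::Call(a, b) => {
            free_vars(a, bound, out);
            free_vars(b, bound, out);
        }
        Expr::Closure { param, body } => {
            bound.push(param.clone());
            free_vars(body, bound, out);
            bound.pop();
        }
    }
}

fn eval(expr: &Expr, vm: &mut Vm) -> Result<Value, String> {
    match expr {
        Expr::Num(n) => Ok(Value::Int(*n)),
        Expr::Var(name) => vm.get(name),
        Expr::Add(a, b) => match (eval(a, vm)?, eval(b, vm)?) {
            (Value::Int(x), Value::Int(y)) => Ok(Value::Int(x + y)),
            _ => Err("cannot add non-integers".into()),
        },
        Expr::Closure { param, body } => {
            // Walk the body, then clone every free variable's current value.
            let mut names = Vec::new();
            free_vars(body, &mut vec![param.clone()], &mut names);
            let mut captured = HashMap::new();
            for name in names {
                let value = vm.get(&name)?;
                captured.insert(name, value);
            }
            Ok(Value::Closure { param: param.clone(), body: body.clone(), captured })
        }
        Expr::Call(callee, arg) => {
            let arg = eval(arg, vm)?;
            match eval(callee, vm)? {
                Value::Closure { param, body, mut captured } => {
                    // A fresh Vm whose only scope holds captures and the argument.
                    captured.insert(param, arg);
                    let mut fresh = Vm { scopes: vec![captured] };
                    eval(&body, &mut fresh)
                }
                _ => Err("not callable".into()),
            }
        }
    }
}
```

Because captures are cloned when the closure value is created, rebinding `x` in the defining scope afterwards does not change what the closure sees. The closure value is self-contained, which is what makes it a reasonable unit for memoization.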

**Incremental:**
In this phase, incremental compilation happens at the granularity of the module
and the closure. Typst memoizes the result of evaluating a source file across
compilations. Furthermore, it memoizes the result of calling a closure with a
certain set of parameters. This is possible because Typst ensures that all
functions are pure. The result of a closure call can be recycled if the closure
has the same syntax and captures, even if the closure value stems from a
different module evaluation (i.e. if a module is reevaluated, previous calls to
closures defined in the module can still be reused).
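
The following sketch shows why keying on syntax and captures lets results survive a module reevaluation. It is a hand-rolled cache built only on the standard library and is not how `comemo` works internally; `Closure`, `Cache`, and `key` are hypothetical names, the closure body is represented as a plain string, and values are reduced to `i64` for brevity.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// A stand-in for a closure value: its syntax plus its captured variables.
#[derive(Hash)]
struct Closure {
    body: String,
    captures: Vec<(String, i64)>,
}

/// Maps a hash of (syntax, captures, arguments) to a previous result.
struct Cache {
    results: HashMap<u64, i64>,
    hits: usize,
}

/// The key depends only on what the closure is, not on where the value
/// came from. A real implementation would also have to consider hash
/// collisions; this sketch ignores them.
fn key(closure: &Closure, args: &[i64]) -> u64 {
    let mut hasher = DefaultHasher::new();
    closure.hash(&mut hasher);
    args.hash(&mut hasher);
    hasher.finish()
}

impl Cache {
    fn new() -> Self {
        Cache { results: HashMap::new(), hits: 0 }
    }

    /// `run` stands in for the interpreter. Skipping it on a cache hit is
    /// only sound because closures are assumed to be pure.
    fn call(&mut self, closure: &Closure, args: &[i64], run: impl FnOnce() -> i64) -> i64 {
        let k = key(closure, args);
        if let Some(&value) = self.results.get(&k) {
            self.hits += 1;
            return value;
        }
        let value = run();
        self.results.insert(k, value);
        value
    }
}
```

Two `Closure` values built separately (for example, before and after the defining module was reevaluated) produce the same key as long as their body and captures agree, so the second call is a cache hit. Changing a captured value changes the key and forces a real call.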


## Layout
The layout phase takes `Content` and produces one `Frame` per page for it. To
lay out `Content`, we first have to _realize_ it by applying all relevant show
rules to the content. Since show rules may be defined as Typst closures,
realization can trigger closure evaluation, which in turn produces content that
is recursively realized. Realization is a shallow process: While collecting list
items into a list that we want to lay out, we don't realize the content within
the list items just yet. This only happens lazily once the list items are
laid out.

When we have realized the content into a layoutable
node, we can then lay it out into _regions,_ which describe the space into which
the content shall be laid out. Within these, a node is free to lay itself out
as it sees fit, returning one `Frame` per region it wants to occupy.

**Introspection:**
How content is laid out (and realized) may depend on how _it itself_ is laid out
(e.g., through page numbers in the table of contents, counters, state, etc.).
Typst resolves these inherently cyclical dependencies through the _introspection
loop:_ The layout phase runs in a loop until the results stabilize. Most
introspections stabilize after one or two iterations. However, some may never
stabilize, so we give up after five attempts.
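
In its simplest form, such a loop is a bounded fixed-point iteration. The sketch below is an assumption about the general shape and not the compiler's actual code: `introspection_loop` is a hypothetical function, the generic `T` stands for whatever layout learns about the document (page numbers, counter values, and so on), and the `layout` callback stands for a full layout pass that consumes the previous results.

```rust
/// Re-runs `layout` with the results of the previous pass until the results
/// stop changing or `max_attempts` passes have been made. Returns the last
/// results and whether they stabilized.
fn introspection_loop<T: PartialEq>(
    initial: T,
    mut layout: impl FnMut(&T) -> T,
    max_attempts: usize,
) -> (T, bool) {
    let mut state = initial;
    for _ in 0..max_attempts {
        let next = layout(&state);
        if next == state {
            // Layout saw exactly the results it produced: a fixed point.
            return (state, true);
        }
        state = next;
    }
    // Give up and keep the latest results, even though they may be off.
    (state, false)
}
```

For example, a table of contents whose own length pushes headings onto later pages needs one pass to discover the page numbers and another to confirm that inserting them did not move anything. A document whose output changes on every pass never reaches a fixed point, which is why the number of attempts is capped.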

**Incremental:**
Layout caching happens at the granularity of a node. This is important because
overall layout is the most expensive compilation phase, so we want to reuse as
much as possible.


## Export
Exporters live in `src/export`. They turn laid-out frames into an output file
format.

- The PDF exporter takes laid-out frames and turns them into a PDF file.
- The built-in renderer takes a frame and turns it into a pixel buffer.
- HTML export does not exist yet, but will in the future. However, this requires
  some complex compiler work because the export will start with `Content`
  instead of `Frame`s (layout is the browser's job).


## IDE
The `src/ide` module implements IDE functionality for Typst. It builds heavily
on the other modules (most importantly, `syntax` and `eval`).

**Syntactic:**
Basic IDE functionality is based on a file's syntax. However, the standard
syntax node is a bit too limited for writing IDE tooling. It doesn't provide
access to its parents or neighbours. This is fine for an evaluation-like
recursive traversal, but impractical for IDE use cases. For this reason, there
is an additional abstraction on top of a syntax node called a `LinkedNode`,
which is used pervasively across the `ide` module.
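
The idea of wrapping a plain node with links to its surroundings can be sketched as follows. The type below borrows the name `LinkedNode` from the text, but its fields and methods are assumptions made for illustration and may differ from the real type in `src/syntax`; the toy `Node` is hypothetical as well.

```rust
use std::rc::Rc;

/// A plain tree node that only knows its children.
struct Node {
    text: &'static str,
    children: Vec<Node>,
}

/// A borrowed node together with a link to its parent and its position
/// among the parent's children. Links are built while descending, so the
/// underlying tree does not need to store parent pointers.
#[derive(Clone)]
struct LinkedNode<'a> {
    node: &'a Node,
    parent: Option<Rc<LinkedNode<'a>>>,
    index: usize,
}

impl<'a> LinkedNode<'a> {
    fn new(root: &'a Node) -> Self {
        LinkedNode { node: root, parent: None, index: 0 }
    }

    fn children(&self) -> Vec<LinkedNode<'a>> {
        let node: &'a Node = self.node;
        let parent = Rc::new(self.clone());
        node.children
            .iter()
            .enumerate()
            .map(|(index, child)| LinkedNode {
                node: child,
                parent: Some(parent.clone()),
                index,
            })
            .collect()
    }

    fn parent(&self) -> Option<&LinkedNode<'a>> {
        self.parent.as_deref()
    }

    fn prev_sibling(&self) -> Option<LinkedNode<'a>> {
        let index = self.index.checked_sub(1)?;
        self.parent()?.children().into_iter().nth(index)
    }

    fn next_sibling(&self) -> Option<LinkedNode<'a>> {
        self.parent()?.children().into_iter().nth(self.index + 1)
    }
}
```

An IDE feature that starts from the node under the cursor can then ask questions like "what comes right before this?" or "which construct encloses this?", which a child-only tree cannot answer.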

**Semantic:**
More advanced functionality like autocompletion requires semantic analysis of
the source. To gain semantic information for things like hover tooltips, we
directly use other parts of the compiler. For instance, to find out the type of
a variable, we evaluate and realize the full document equipped with a `Tracer`
that emits the variable's value whenever it is visited. From the set of
resulting values, we can then compute the set of types a value takes on. Thanks
to incremental compilation, we can recycle large parts of the compilation that
we had to do anyway to typeset the document.

**Incremental:**
Syntactic IDE functionality is relatively cheap for now, so there are no special
incrementality concerns. Semantic analysis with a tracer is relatively
expensive. However, large parts of a traced analysis compilation can reuse
memoized results from a previous normal compilation. Only the module evaluation
of the active file and layout code that somewhere within evaluates source code
in the active file needs to re-run. This is all handled automatically by
`comemo` because the tracer is wrapped in a `comemo::TrackedMut` container.


## Tests
Typst has an extensive suite of integration tests. A test file consists of
multiple tests that are separated by `---`. For each test file, we store a
reference image defining what the compiler _should_ output. To manage the
reference images, you can use the VS Code extension in `tools/test-helper`.

The integration tests cover parsing, evaluation, realization, layout and
rendering. PDF output is sadly untested, but most bugs are in earlier phases of
the compiler; the PDF output itself is relatively straightforward. IDE
functionality is also mostly untested. PDF and IDE testing should be added in
the future.

[`comemo`]: https://github.com/typst/comemo/

@ -35,7 +35,7 @@ currently in public beta.
 ## Example
 This is what a Typst file with a bit of math and automation looks like:
 <p align="center">
-  <img alt="Example" width="900" src="https://user-images.githubusercontent.com/17899797/226110084-a4e7eff2-33cb-44b3-aced-2bef2e52148d.png"/>
+  <img alt="Example" width="900" src="https://user-images.githubusercontent.com/17899797/226122655-db82e9fa-6942-47a5-9e14-a67183617f6f.png"/>
 </p>

 Let's dissect what's going on:
@ -165,13 +165,13 @@ instant preview. To achieve these goals, we follow three core design principles:
   Luckily we have [`comemo`], a system for incremental compilation which does
   most of the hard work in the background.

-[docs]: https://typst.app/docs
+[docs]: https://typst.app/docs/
 [app]: https://typst.app/
 [discord]: https://discord.gg/2uDybryKPe
 [show]: https://typst.app/docs/reference/styling/#show-rules
 [math]: https://typst.app/docs/reference/math/
 [scripting]: https://typst.app/docs/reference/scripting/
-[rust]: https://rustup.rs
-[releases]: https://github.com/typst/typst/releases
+[rust]: https://rustup.rs/
+[releases]: https://github.com/typst/typst/releases/
 [architecture]: https://github.com/typst/typst/blob/main/ARCHITECTURE.md
 [`comemo`]: https://github.com/typst/comemo/

@ -380,8 +380,11 @@ impl<'a, 'v, 't> Builder<'a, 'v, 't> {
         let Some(doc) = &mut self.doc else { return Ok(()) };
         if !self.flow.0.is_empty() || (doc.keep_next && styles.is_some()) {
             let (flow, shared) = mem::take(&mut self.flow).0.finish();
-            let styles =
-                if shared == StyleChain::default() { styles.unwrap() } else { shared };
+            let styles = if shared == StyleChain::default() {
+                styles.unwrap_or_default()
+            } else {
+                shared
+            };
             let page = PageNode::new(FlowNode::new(flow.to_vec()).pack()).pack();
             let stored = self.scratch.content.alloc(page);
             self.accept(stored, styles)?;

@ -39,14 +39,14 @@ cast_from_value! {
 /// Display: Query
 /// Category: special
 #[node(Locatable, Show)]
-pub struct QueryNode {
+struct QueryNode {
     /// The thing to search for.
     #[required]
-    pub target: Selector,
+    target: Selector,

     /// The function to format the results with.
     #[required]
-    pub format: Func,
+    format: Func,
 }

 impl Show for QueryNode {
@ -58,7 +58,6 @@ impl Show for QueryNode {
         let id = self.0.stable_id().unwrap();
         let target = self.target();
         let (before, after) = vt.introspector.query_split(target, id);
-        let func = self.format();
-        Ok(func.call_vt(vt, [before.into(), after.into()])?.display())
+        Ok(self.format().call_vt(vt, [before.into(), after.into()])?.display())
     }
 }

@ -5,7 +5,7 @@ use std::ops::{Add, AddAssign};
 use ecow::{eco_format, EcoString, EcoVec};

 use super::{ops, Args, Func, Value, Vm};
-use crate::diag::{bail, At, SourceResult, StrResult};
+use crate::diag::{At, SourceResult, StrResult};
 use crate::util::pretty_array_like;

 /// Create a new [`Array`] from values.
@ -139,9 +139,6 @@ impl Array {

     /// Return the first matching element.
     pub fn find(&self, vm: &mut Vm, func: Func) -> SourceResult<Option<Value>> {
-        if func.argc().map_or(false, |count| count != 1) {
-            bail!(func.span(), "function must have exactly one parameter");
-        }
         for item in self.iter() {
             let args = Args::new(func.span(), [item.clone()]);
             if func.call_vm(vm, args)?.cast::<bool>().at(func.span())? {
@ -153,9 +150,6 @@ impl Array {

     /// Return the index of the first matching element.
     pub fn position(&self, vm: &mut Vm, func: Func) -> SourceResult<Option<i64>> {
-        if func.argc().map_or(false, |count| count != 1) {
-            bail!(func.span(), "function must have exactly one parameter");
-        }
         for (i, item) in self.iter().enumerate() {
             let args = Args::new(func.span(), [item.clone()]);
             if func.call_vm(vm, args)?.cast::<bool>().at(func.span())? {
@ -169,9 +163,6 @@ impl Array {
     /// Return a new array with only those elements for which the function
     /// returns true.
     pub fn filter(&self, vm: &mut Vm, func: Func) -> SourceResult<Self> {
-        if func.argc().map_or(false, |count| count != 1) {
-            bail!(func.span(), "function must have exactly one parameter");
-        }
         let mut kept = EcoVec::new();
         for item in self.iter() {
             let args = Args::new(func.span(), [item.clone()]);
@ -184,9 +175,6 @@ impl Array {

     /// Transform each item in the array with a function.
     pub fn map(&self, vm: &mut Vm, func: Func) -> SourceResult<Self> {
-        if func.argc().map_or(false, |count| !(1..=2).contains(&count)) {
-            bail!(func.span(), "function must have one or two parameters");
-        }
         let enumerate = func.argc() == Some(2);
         self.iter()
             .enumerate()
@ -203,9 +191,6 @@ impl Array {

     /// Fold all of the array's elements into one with a function.
     pub fn fold(&self, vm: &mut Vm, init: Value, func: Func) -> SourceResult<Value> {
-        if func.argc().map_or(false, |count| count != 2) {
-            bail!(func.span(), "function must have exactly two parameters");
-        }
         let mut acc = init;
         for item in self.iter() {
             let args = Args::new(func.span(), [acc, item.clone()]);
@ -216,9 +201,6 @@ impl Array {

     /// Whether any element matches.
     pub fn any(&self, vm: &mut Vm, func: Func) -> SourceResult<bool> {
-        if func.argc().map_or(false, |count| count != 1) {
-            bail!(func.span(), "function must have exactly one parameter");
-        }
         for item in self.iter() {
             let args = Args::new(func.span(), [item.clone()]);
             if func.call_vm(vm, args)?.cast::<bool>().at(func.span())? {
@ -231,9 +213,6 @@ impl Array {

     /// Whether all elements match.
     pub fn all(&self, vm: &mut Vm, func: Func) -> SourceResult<bool> {
-        if func.argc().map_or(false, |count| count != 1) {
-            bail!(func.span(), "function must have exactly one parameter");
-        }
         for item in self.iter() {
             let args = Args::new(func.span(), [item.clone()]);
             if !func.call_vm(vm, args)?.cast::<bool>().at(func.span())? {

@ -343,12 +343,7 @@ impl Debug for Transform {
 cast_from_value! {
     Transform,
     content: Content => Self::Content(content),
-    func: Func => {
-        if func.argc().map_or(false, |count| count != 1) {
-            Err("function must have exactly one parameter")?
-        }
-        Self::Func(func)
-    },
+    func: Func => Self::Func(func),
 }

 /// A chain of style maps, similar to a linked list.
@ -494,6 +489,15 @@ impl<'a> StyleChain<'a> {
         })
     }

+    /// Convert to a style map.
+    pub fn to_map(self) -> StyleMap {
+        let mut suffix = StyleMap::new();
+        for link in self.links() {
+            suffix.0.splice(0..0, link.iter().cloned());
+        }
+        suffix
+    }
+
     /// Iterate over the entries of the chain.
     fn entries(self) -> Entries<'a> {
         Entries { inner: [].as_slice().iter(), links: self.links() }

@ -163,7 +163,7 @@
 #test((1, 2, 3, 4).fold(0, (s, x) => s + x), 10)

 ---
-// Error: 20-30 function must have exactly two parameters
+// Error: 20-22 unexpected argument
 #(1, 2, 3).fold(0, () => none)

 ---