Low-Level API

Syside splits models into chunks, or Documents, each corresponding to a single source file. While this is partly to support editor applications (LSP) that must work on a per-source-file basis, it also provides a sensible way to split models for multithreading. To support multithreading in editor applications, each Document and TextDocument is protected by a mutex through SharedMutex. The Python API provides a context-manager wrapper that acquires and releases the mutex automatically:

with mutex.lock() as document:
    # mutex acquired here
    ...

Internally, it is a type-erased wrapper around a shared mutex-like object that provides shared access for read-only operations and unique access for write operations. Type erasure allows identical interfaces for both single-threaded (noop mutex) and multi-threaded (shared mutex) objects, which may prove beneficial if builds for free-threaded Python are offered in the future.

Unfortunately, Python does not have read-only semantics, thus SharedMutex is equivalent to a regular mutex – all accesses are unique, unless it is a noop mutex, e.g. from parse_string_st.
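The locking pattern described above can be illustrated without Syside. The following is a minimal sketch using only the standard library; `NoopLock` and `Guarded` are hypothetical names invented for this illustration and are not part of the Syside API. It shows how a single interface can wrap either a do-nothing lock or a real exclusive lock.

```python
import threading
from contextlib import contextmanager
from typing import Generic, Iterator, TypeVar

T = TypeVar("T")


class NoopLock:
    """Stand-in lock for single-threaded use: acquiring it does nothing."""

    def acquire(self) -> bool:
        return True

    def release(self) -> None:
        pass


class Guarded(Generic[T]):
    """Hypothetical wrapper that hands out its value only while the lock is held."""

    def __init__(self, value: T, lock) -> None:
        self._value = value
        self._lock = lock  # any object with acquire() and release()

    @contextmanager
    def lock(self) -> Iterator[T]:
        self._lock.acquire()
        try:
            yield self._value
        finally:
            self._lock.release()  # released even if the body raises


single = Guarded({"name": "P"}, NoopLock())       # behaves like a noop mutex
multi = Guarded({"name": "Q"}, threading.Lock())  # every access is exclusive

with multi.lock() as doc:
    doc["name"] = "Q2"
```

Because both variants expose the same `lock()` method, calling code does not need to know which kind of lock it was given.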

Additionally, each Document acts as a memory resource for its owned nodes (elements). This improves memory usage and enables the useful and efficient nodes and all_nodes methods. It does prevent moving nodes from one document to another, but that is a small price to pay for the performance benefits.
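One way that ownership can make type-based node lookup cheap is sketched below. This is an assumption for illustration only: the text does not state how Syside stores nodes, and `Arena`, `exact`, and `including_subtypes` are hypothetical names, as are the three stand-in node classes.

```python
from collections import defaultdict


class Node:
    pass


class Usage(Node):
    pass


class PartUsage(Usage):
    pass


class Arena:
    """Hypothetical owner that stores every node it creates, bucketed by exact type."""

    def __init__(self) -> None:
        self._buckets = defaultdict(list)

    def create(self, node_type: type) -> Node:
        node = node_type()
        self._buckets[node_type].append(node)
        return node

    def exact(self, node_type: type) -> list:
        # Only nodes whose type is exactly node_type.
        return list(self._buckets.get(node_type, []))

    def including_subtypes(self, node_type: type) -> list:
        # Nodes of node_type and of any subclass, without walking a tree.
        return [
            node
            for stored_type, bucket in self._buckets.items()
            if issubclass(stored_type, node_type)
            for node in bucket
        ]


arena = Arena()
arena.create(Usage)
arena.create(PartUsage)
arena.create(PartUsage)
```

In this sketch a node belongs to the bucket of the arena that created it, which is also why moving it to another owner would not be a simple reassignment.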

Pipelines

Underneath everything, Documents in Syside are built by a Pipeline. At a high level, this is a sequence of steps:

[Diagram: create pipeline (reusable) → schedule → execute]

A Pipeline is constructed with

Note

The current Python API is limited and does not allow additional validation rules in the pipeline.

Constructed Pipelines are used to create build Schedules that are completed on an Executor (pool of worker threads):

schedule: syside.Schedule = pipeline.schedule(
    documents,
    options=syside.ScheduleOptions(
        validation_timing=syside.ValidationTiming.Manual
    ),
    invalidated=[],
)
result: syside.ExecutionResult = syside.get_default_executor().run(schedule)

Note

Executor.run consumes the passed-in schedule – attempting to access its attributes afterwards will raise a RuntimeError. Instead, a cleared Schedule is returned in result.schedule. This should be made reusable for scheduling with fewer allocations in a future release.

Note

Executors are thread pools underneath. Because starting new threads has a runtime cost, they should be reused as much as possible. Syside provides a default executor with syside.get_default_executor().
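The reuse advice can be illustrated with the standard library's thread pool. This is a sketch of the general pattern only, not of Syside's executor; `get_shared_pool` and `run_batch` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


@lru_cache(maxsize=None)
def get_shared_pool() -> ThreadPoolExecutor:
    """Create the pool on first use and return the same instance afterwards."""
    return ThreadPoolExecutor(max_workers=4)


def run_batch(items: list[int]) -> list[int]:
    # Every call borrows the same worker threads instead of starting new ones.
    return list(get_shared_pool().map(lambda x: x * x, items))


first = run_batch([1, 2, 3])
second = run_batch([4, 5])
```

Both batches run on the same pool, so the thread start-up cost is paid once.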

Internally, schedules are a sequence of build stages, some of which can run in parallel. Parallelism is an implementation detail, and will use worker threads from Executor.

[Diagram: Parse → AST → Indexing → Caching → Sema → Validation]

Each stage updates BasicDocument.build_state to the corresponding value on completion and skips documents that have already completed it. Note that the Schedule is safe to execute on multiple threads without requiring explicit synchronization of separate documents – this is achieved with build dependencies between both documents and stages.
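The skip-if-already-done behaviour, together with the cutoff option used in the per-stage snippets, can be modelled in a few lines. This is a simplified sketch: `State`, `Doc`, and `run_stages` are hypothetical, and the enum mirrors only the build states named in this document, so the real enum may contain other members.

```python
from enum import IntEnum


class State(IntEnum):
    """Illustrative build states, ordered like the stages above."""
    CHANGED = 0
    PARSED = 1
    INDEXED = 2
    BUILT = 3
    VALIDATED = 4


class Doc:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = State.CHANGED


def run_stages(doc: Doc, cutoff: State = State.VALIDATED) -> list[str]:
    """Advance doc up to cutoff, skipping stages it has already completed."""
    ran = []
    for stage in State:
        if stage == State.CHANGED or stage <= doc.state:
            continue  # nothing to run, or already done
        if stage > cutoff:
            break  # stop at the requested cutoff
        doc.state = stage
        ran.append(stage.name)
    return ran


doc = Doc("a.sysml")
first = run_stages(doc, cutoff=State.INDEXED)  # parse and index only
second = run_stages(doc)                       # resumes after indexing
doc.state = State.PARSED                       # lower the state to redo indexing
third = run_stages(doc)
```

Lowering the recorded state is what causes later runs to repeat a stage, which is the same mechanism the sections below use to force re-parsing, re-indexing, or re-running sema.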

Parse & AST

This is the first stage the pipeline runs. It parses the source and constructs an initial AST (not including implied relationships).

assert document.build_state == syside.BuildState.Parsed

# complete schedule here with
options = syside.ScheduleOptions(
    validation_timing=syside.ValidationTiming.Manual,
    cutoff=syside.BuildState.Parsed,
)

This can also be directly achieved with

mutex: syside.SharedMutex[syside.Document]
diagnostics: list[syside.Diagnostic]
# single-threaded - noop mutex
mutex, diagnostics = syside.Document.parse_string_st(
    "package P;", syside.ModelLanguage.SysML
)

# multi-threaded - mutex
mutex, diagnostics = syside.Document.parse_string_mt(
    "package P;", syside.ModelLanguage.KerML
)

If the source has changed, e.g. through TextDocument.update, set document.build_state = syside.BuildState.Changed to force the pipeline to reparse the source and rebuild the AST.

Indexing & Caching

Indexing is executed immediately after constructing the AST. The resulting index is used to resolve cross-document references in the sema stage.

Caching is dependent on all related documents having completed indexing and thus acts as a synchronization barrier – a single cache is assumed per language. In SysML and KerML, caching populates a shared Stdlib.

assert document.build_state == syside.BuildState.Indexed

# complete schedule here with
options = syside.ScheduleOptions(
    validation_timing=syside.ValidationTiming.Manual,
    cutoff=syside.BuildState.Indexed,
)
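The barrier role of caching described above can be pictured as follows: documents are indexed in parallel, and the shared cache is filled only once every index has finished. This is a sketch using the standard library; `index_document` and the sample sources are hypothetical and do not reflect Syside's index format.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def index_document(name: str, source: str) -> dict[str, str]:
    """Pretend indexing: map each exported name to its owning document."""
    return {symbol: name for symbol in source.split()}


sources = {"lib.kerml": "Base Anything", "model.sysml": "Vehicle Engine"}

with ThreadPoolExecutor(max_workers=2) as pool:
    # Indexing of separate documents can proceed in parallel.
    futures = [pool.submit(index_document, n, s) for n, s in sources.items()]
    # Barrier: the cache is built only after every index is complete.
    wait(futures)
    cache: dict[str, str] = {}
    for future in futures:
        cache.update(future.result())
```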

Indexing is possible manually:

Caching, too:

To re-index documents in the pipeline, set document.build_state to syside.BuildState.Parsed or lower. This is only really needed if named members in document.root_node have changed; otherwise, reference resolution is a walk through the model graph.

Sema

This is the semantic resolution stage. It is responsible for resolving references and enforcing the specification's semantic constraints (those whose names start with check in the specification). Sema requires caching to have completed.

assert document.build_state == syside.BuildState.Built

# complete schedule here with
options = syside.ScheduleOptions(
    validation_timing=syside.ValidationTiming.Manual,
    cutoff=syside.BuildState.Built,
)

Sema can be performed manually:

This is the most important stage in the SysML pipeline, and its actions depend on relationships between elements. Moreover, because of the abundance of cycles and unclear cause-and-effect relationships in SysML, sema automatically updates Element.sema_state and ignores elements with element.sema_state != syside.SemaState.none. If a model is expected to be modified, prefer parsing the initial AST with parse_string_st or parse_string_mt to avoid having to discard sema results. Otherwise, sema can be reset with

# for a single element
syside.sema_reset(element)

# or the whole document
syside.sema_reset(document)

The latter additionally resets resolved references back to placeholder values so that sema has to resolve them again. Before doing so, it checks that the resolved references in the model match the resolved references from the source. Any mismatches are reported through the last parameter, a reporter callable, and the mismatched references are left untouched.
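The interplay between the per-element state flag and resetting can be modelled with a small stand-in. This is a sketch of the idea only; `Sema`, `Element`, `resolve`, and `reset` here are hypothetical simplifications and not the Syside types or functions of similar names.

```python
from enum import Enum


class Sema(Enum):
    NONE = "none"
    DONE = "done"


class Element:
    def __init__(self, name: str) -> None:
        self.name = name
        self.sema_state = Sema.NONE


def resolve(elements: list[Element]) -> list[str]:
    """Process only untouched elements and mark them so they are not redone."""
    processed = []
    for element in elements:
        if element.sema_state != Sema.NONE:
            continue  # already resolved, ignored on later runs
        element.sema_state = Sema.DONE
        processed.append(element.name)
    return processed


def reset(element: Element) -> None:
    # Clearing the flag makes the element eligible for resolution again.
    element.sema_state = Sema.NONE


elements = [Element("a"), Element("b")]
first = resolve(elements)   # both elements are processed
second = resolve(elements)  # nothing left to do
reset(elements[0])
third = resolve(elements)   # only the reset element is processed
```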

To re-run sema in the pipeline, set document.build_state to syside.BuildState.Indexed or lower. The same should be done for any dependent documents. Additionally, documents can be rebuilt from the sema stage by passing them as invalidated in Pipeline.schedule.
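Collecting the dependent documents that also need to be rebuilt amounts to a graph traversal. The following sketch assumes the caller already has a map from each document to the documents that depend on it; the map, the file names, and `with_dependents` are hypothetical and not provided by Syside.

```python
from collections import deque


def with_dependents(
    changed: set[str], dependents: dict[str, list[str]]
) -> set[str]:
    """Return changed documents plus everything that transitively depends on them."""
    result = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in result:
                result.add(dependent)
                queue.append(dependent)
    return result


# Hypothetical graph: each key is used by the documents in its value list.
dependents = {
    "base.sysml": ["parts.sysml"],
    "parts.sysml": ["vehicle.sysml"],
    "vehicle.sysml": [],
}
to_rebuild = with_dependents({"base.sysml"}, dependents)
```

The resulting set is what would have its build state lowered, or be passed as invalidated, in a real rebuild.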

Validation

This is the last stage in the pipeline and is responsible for validating the model – checking that validation constraints (starting with validate) defined in the specification are satisfied. This stage runs after documents have been semantically resolved. Note that validation does not usually check standard check... constraints as they should be enforced by the model or semantic resolution.

Note

Diagnostic pragmas that can disable unwanted diagnostics need to be implemented before non-standard lint rules are added.

assert document.build_state == syside.BuildState.Validated

# complete schedule here with (default, no real effect)
options = syside.ScheduleOptions(
    validation_timing=syside.ValidationTiming.Manual,
    cutoff=syside.BuildState.Validated,
)

Already validated documents can be revalidated in the pipeline with

schedule = pipeline.schedule(
    [document],
    options=syside.ScheduleOptions(
        validation_timing=syside.ValidationTiming.Manual,
        force_revalidation=True,
    ),
)

# or
with document.lock() as doc:
    doc.build_state = syside.BuildState.Built  # or lower

Pipeline scheduling contains additional validation options:

  • validation_tier controls the minimum document_tier that is validated; documents with lower tiers are skipped. This is primarily used in editor applications where users are not usually concerned with diagnostics from the standard and external libraries. For example, setting validation_tier = syside.DocumentTier.External will validate documents with document_tier in (syside.DocumentTier.External, syside.DocumentTier.Project) but not documents with document_tier == syside.DocumentTier.StandardLibrary.

  • validation_timing controls the cost level of validation rules that run, e.g. validation_timing == syside.ValidationTiming.Manual will run all validation rules, while validation_timing == syside.ValidationTiming.OnType will run only those cheap enough to run on every keystroke. At the moment, all built-in validation rules are cheap enough to run OnType, but that may change in the future. Note that validation_timing == syside.ValidationTiming.Never will effectively skip the validation stage.
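Both options act as thresholds, which the following sketch makes concrete. The enums `Tier` and `Timing`, their numeric ordering, and the two helper functions are assumptions made for illustration; they mirror the behaviour described in the list above but are not the Syside enums.

```python
from enum import IntEnum


class Tier(IntEnum):
    STANDARD_LIBRARY = 0
    EXTERNAL = 1
    PROJECT = 2


class Timing(IntEnum):
    NEVER = 0
    ON_TYPE = 1
    MANUAL = 2


def documents_to_validate(tiers: dict[str, Tier], minimum: Tier) -> list[str]:
    """Keep documents whose tier is at least the requested one."""
    return [name for name, tier in tiers.items() if tier >= minimum]


def rules_to_run(rule_costs: dict[str, Timing], timing: Timing) -> list[str]:
    """Keep rules cheap enough for the requested timing; NEVER keeps none."""
    return [name for name, cost in rule_costs.items() if cost <= timing]


tiers = {
    "ScalarValues.kerml": Tier.STANDARD_LIBRARY,
    "vendor.sysml": Tier.EXTERNAL,
    "model.sysml": Tier.PROJECT,
}
rules = {"cheap_rule": Timing.ON_TYPE, "expensive_rule": Timing.MANUAL}

selected_docs = documents_to_validate(tiers, Tier.EXTERNAL)
on_type_rules = rules_to_run(rules, Timing.ON_TYPE)
manual_rules = rules_to_run(rules, Timing.MANUAL)
skipped = rules_to_run(rules, Timing.NEVER)
```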