Low-Level API
Syside splits models into chunks, or Documents, each
corresponding to a single source file. While this is partly to support editor
applications (LSP) that must work on a per-source-file basis, it also provides a
sensible model splitting for multithreading. To support multithreading in editor
applications, each Document and TextDocument is protected by a mutex, exposed as SharedMutex. The Python API provides a context-manager wrapper that acquires and releases it automatically:
with mutex.lock() as document:
    # mutex acquired here
    ...
Internally, SharedMutex is a type-erased wrapper around a shared-mutex-like object that provides shared access for read-only operations and unique access for write operations. Type erasure gives single-threaded (no-op mutex) and multi-threaded (shared mutex) objects an identical interface, which may prove beneficial if builds for free-threaded Python are offered in the future.
Unfortunately, Python has no read-only semantics, so in the Python API SharedMutex is equivalent to a regular mutex: all accesses are unique,
unless it is a no-op mutex, e.g. one returned by parse_string_st.
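The idea of one interface over both a no-op and a real lock can be illustrated in plain Python. This is only a conceptual sketch built on the standard library: NoopLock and GuardedValue are hypothetical names, not part of Syside, and Syside's actual SharedMutex is implemented differently.

```python
import threading
from contextlib import contextmanager


class NoopLock:
    """Lock stand-in for single-threaded use: acquire and release do nothing."""

    def acquire(self):
        return True

    def release(self):
        pass


class GuardedValue:
    """Hypothetical analogue of SharedMutex: the value is reached only via lock()."""

    def __init__(self, value, lock=None):
        self._value = value
        # Any object with acquire()/release() works, so callers see one interface.
        self._lock = lock if lock is not None else NoopLock()

    @contextmanager
    def lock(self):
        self._lock.acquire()
        try:
            yield self._value
        finally:
            # Released even if the body of the with-statement raises.
            self._lock.release()


single = GuardedValue({"name": "P"})                   # no-op lock
multi = GuardedValue({"name": "P"}, threading.Lock())  # real lock

for guarded in (single, multi):
    with guarded.lock() as doc:
        doc["visited"] = True
```

Calling code is identical in both cases; only the object passed at construction decides whether any synchronization actually happens.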
Additionally, each Document acts as a memory resource for
its owned nodes (elements). This improves memory usage and enables the very useful
and efficient nodes and all_nodes methods. The trade-off is that nodes cannot be moved from one
document to another, which is a small price to pay for the performance benefits.
Pipelines
Underneath everything, Documents in Syside are built by a
Pipeline. At a high level, this is a sequence of build stages: parsing, indexing and
caching, semantic resolution (sema), and validation.
A Pipeline is constructed with
pipeline: syside.Pipeline = syside.make_pipeline(
    syside.PipelineOptions(lib=None, static_index=None)
)
Note
The current Python API is limited and does not allow adding validation rules to the pipeline.
Constructed Pipelines are used to create build
Schedules that are completed on an Executor (pool of worker threads):
schedule: syside.Schedule = pipeline.schedule(
    documents,
    options=syside.ScheduleOptions(
        validation_timing=syside.ValidationTiming.Manual
    ),
    invalidated=[],
)
result: syside.ExecutionResult = syside.get_default_executor().run(schedule)
Note
Executor.run consumes the passed-in schedule –
attempting to access its attributes afterwards will raise a RuntimeError.
Instead, a cleared Schedule is returned in
result.schedule. This should be made
reusable for scheduling with fewer allocations in a future release.
Note
Executors are thread pools underneath, so due to
the runtime cost of starting new threads, they should be reused as much as possible.
Syside provides a default executor with syside.get_default_executor().
Internally, schedules are a sequence of build stages, some of which can run in parallel.
Parallelism is an implementation detail, and will use worker threads from
Executor.
Each completed stage updates BasicDocument.build_state to the corresponding value, and skips documents
that have already completed that stage. Note that the Schedule is safe to execute on multiple threads without requiring explicit
synchronization of separate documents – this is achieved with build dependencies
between both documents and stages.
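The stage-by-stage behaviour described above (ordered stages, skipping documents that are already past a stage, stopping at a cutoff, and reusing one thread pool) can be modelled with the standard library alone. The sketch below is a toy model, not Syside's scheduler: State, Doc, run_stage and build are hypothetical names, the state ordering is an assumption based on the stage names in this section, and every stage is treated as a full barrier for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from enum import IntEnum
from itertools import repeat


class State(IntEnum):
    # Hypothetical ordering that mirrors the stage names in this section.
    Changed = 0
    Parsed = 1
    Indexed = 2
    Built = 3
    Validated = 4


@dataclass
class Doc:
    name: str
    state: State = State.Changed


def run_stage(doc, target, log):
    """Advance one document to `target`, unless it is already there."""
    if doc.state >= target:
        return  # completed previously, so the stage skips it
    log.append((doc.name, target.name))
    doc.state = target


def build(docs, executor, cutoff=State.Validated):
    """Run stages in order up to `cutoff`; documents in a stage run in parallel."""
    log = []
    for target in (s for s in State if State.Changed < s <= cutoff):
        # list() drains the results, so a stage finishes before the next starts.
        list(executor.map(run_stage, docs, repeat(target), repeat(log)))
    return log


executor = ThreadPoolExecutor(max_workers=2)  # created once and reused
docs = [Doc("a"), Doc("b", State.Indexed)]
first = build(docs, executor, cutoff=State.Indexed)  # only "a" has work to do
second = build(docs, executor)                       # both continue from Indexed
executor.shutdown()
```

The second call does no parsing or indexing work because both documents already carry a state at or beyond those stages, which is the same reason lowering build_state is how a stage is forced to run again.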
Parse & AST
This is the first stage in the pipeline. It parses the source and constructs an initial AST (not including implied relationships).
assert document.build_state == syside.BuildState.Parsed
# complete schedule here with
options = syside.ScheduleOptions(
    validation_timing=syside.ValidationTiming.Manual,
    cutoff=syside.BuildState.Parsed,
)
This can also be directly achieved with
mutex: syside.SharedMutex[syside.Document]
diagnostics: list[syside.Diagnostic]
# single-threaded - noop mutex
mutex, diagnostics = syside.Document.parse_string_st(
    "package P;", syside.ModelLanguage.SysML
)
# multi-threaded - mutex
mutex, diagnostics = syside.Document.parse_string_mt(
    "package P;", syside.ModelLanguage.KerML
)
If the source has changed, e.g. through TextDocument.update, set document.build_state = syside.BuildState.Changed
to force the pipeline to reparse the source and rebuild the AST.
Indexing & Caching
Indexing is executed immediately after constructing the AST. It is used to resolve cross-document references in the sema stage.
Caching is dependent on all related documents having completed indexing and thus acts as
a synchronization barrier – a single cache is assumed per language. In SysML and KerML,
caching populates a shared Stdlib.
assert document.build_state == syside.BuildState.Indexed
# complete schedule here with
options = syside.ScheduleOptions(
    validation_timing=syside.ValidationTiming.Manual,
    cutoff=syside.BuildState.Indexed,
)
Indexing is possible manually:
Caching, too:
lib = syside.Stdlib(index)
assert lib.all_complete
To re-index documents in the pipeline, set document.build_state =
syside.BuildState.Parsed or lower. This is only needed if named members in
document.root_node have changed, because reference resolution is otherwise a walk
through the model graph.
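Why indexing has to finish for all related documents before references can be resolved is easy to see in a toy model. The sketch below uses plain dictionaries; build_index, resolve_references and the document layout are hypothetical and only illustrate the idea, they are not Syside's index or Stdlib.

```python
def build_index(documents):
    """Map each exported top-level name to the document that declares it."""
    index = {}
    for doc_name, doc in documents.items():
        for member in doc["members"]:
            index[member] = doc_name
    return index


def resolve_references(documents, index):
    """Look up every reference in the index; unknown names resolve to None."""
    return {
        doc_name: {ref: index.get(ref) for ref in doc["references"]}
        for doc_name, doc in documents.items()
    }


documents = {
    "a.sysml": {"members": ["Vehicle"], "references": ["Engine"]},
    "b.sysml": {"members": ["Engine"], "references": ["Vehicle", "Missing"]},
}

# Every document is indexed first; resolving earlier could miss names
# declared in a document that has not been indexed yet.
index = build_index(documents)
resolved = resolve_references(documents, index)
```

In this model, only a change to a document's `members` list requires rebuilding the index, which corresponds to the remark above that re-indexing is needed only when named members of the root node change.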
Sema
This is the semantic resolution stage. It resolves references and applies the
semantic constraints defined in the specification (those whose names start with check). Sema
requires caching to have completed.
assert document.build_state == syside.BuildState.Built
# complete schedule here with
options = syside.ScheduleOptions(
    validation_timing=syside.ValidationTiming.Manual,
    cutoff=syside.BuildState.Built,
)
Sema can be performed manually:
syside.Sema().resolve([document], index, lib)
This is the most important stage in the SysML pipeline, and its actions
depend on relationships between elements. Moreover, because of the abundance of cycles
and unclear cause-and-effect relationships in SysML, sema automatically updates
Element.sema_state and ignores elements with
element.sema_state != syside.SemaState.none. If a model is expected to be modified,
prefer parsing the initial AST with parse_string_st or parse_string_mt to avoid having to discard sema results. Otherwise,
sema can be reset with
# for a single element
syside.sema_reset(element)
# or the whole document
syside.sema_reset(document)
The latter additionally resets resolved references back to placeholder values so that
sema has to do reference resolution again. Before resetting them, it checks that the
resolved references in the model match the resolved references from the source. Any
mismatches are reported through the last parameter, a reporter callable, and those
references are left untouched.
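The skip-unless-reset behaviour of sema can be pictured with a small stand-alone model. Everything in the sketch is hypothetical (the SemaState members other than none, the Element fields, and both functions); it only mirrors the rule that elements whose state is not none are ignored until they are reset.

```python
from dataclasses import dataclass
from enum import Enum


class SemaState(Enum):
    none = 0
    resolved = 1  # hypothetical name for "sema already ran"


@dataclass
class Element:
    name: str
    sema_state: SemaState = SemaState.none
    runs: int = 0  # how many times sema processed this element


def resolve(elements):
    """Process only elements that sema has not touched yet."""
    for element in elements:
        if element.sema_state is not SemaState.none:
            continue
        element.runs += 1
        element.sema_state = SemaState.resolved


def sema_reset(element):
    """Mark the element so that the next resolve() processes it again."""
    element.sema_state = SemaState.none


a, b = Element("a"), Element("b")
resolve([a, b])
resolve([a, b])  # no-op: both elements are already marked
sema_reset(a)
resolve([a, b])  # only `a` is processed again
```

This is why modifying an already resolved model has no effect on sema results until the affected elements, or the whole document, are reset.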
To re-run sema in the pipeline, set document.build_state = syside.BuildState.Indexed or
lower. The same should be done for any dependent documents.
Additionally, documents can be rebuilt from the sema stage by passing them as
invalidated in Pipeline.schedule.
Validation
This is the last stage in the pipeline and is responsible for validating the model –
checking that validation constraints (starting with validate) defined in the
specification are satisfied. This stage runs after documents have been semantically
resolved. Note that validation does not usually check standard check... constraints
as they should be enforced by the model or semantic resolution.
Note
Before non-standard lint rules can be added, diagnostic pragmas that can disable unwanted diagnostics need to be implemented.
assert document.build_state == syside.BuildState.Validated
# complete schedule here with (default, no real effect)
options = syside.ScheduleOptions(
    validation_timing=syside.ValidationTiming.Manual,
    cutoff=syside.BuildState.Validated,
)
Already validated documents can be revalidated in the pipeline with
schedule = pipeline.schedule(
    [document],
    options=syside.ScheduleOptions(
        validation_timing=syside.ValidationTiming.Manual,
        force_revalidation=True,
    ),
)
# or
with document.lock() as doc:
    doc.build_state = syside.BuildState.Built # or lower
Pipeline scheduling contains additional validation options:
validation_tier controls the level of document_tier that is validated, skipping documents with lower tiers. This is primarily used in editor applications, where users are not usually concerned with diagnostics from the standard and external libraries. For example, setting validation_tier = syside.DocumentTier.External will validate documents with document_tier in (syside.DocumentTier.External, syside.DocumentTier.Project) but not documents with document_tier == syside.DocumentTier.StandardLibrary.
validation_timing controls the cost level of validation rules that run, e.g. validation_timing == syside.ValidationTiming.Manual will run all validation rules, while validation_timing == syside.ValidationTiming.OnType will run only those cheap enough to run on every keystroke. At the moment, all built-in validation rules are cheap enough to run OnType, but that may change in the future. Note that validation_timing == syside.ValidationTiming.Never will effectively skip the validation stage.
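The two options act as thresholds, which a small stand-alone model makes concrete. The enums below are hypothetical stand-ins whose ordering is inferred from the description above (they are not Syside's DocumentTier or ValidationTiming, which may define more members), and the rule names are invented for illustration.

```python
from enum import IntEnum


class Tier(IntEnum):
    # Assumed ordering: a higher value is closer to the user's own project.
    StandardLibrary = 0
    External = 1
    Project = 2


class Timing(IntEnum):
    # Assumed ordering: a higher value allows more expensive rules.
    Never = 0
    OnType = 1
    Manual = 2


def should_validate(document_tier, validation_tier):
    """Documents below the configured tier are skipped."""
    return document_tier >= validation_tier


def rules_to_run(rules, validation_timing):
    """Keep rules whose cost level does not exceed the configured timing.

    `rules` maps a rule name to the cheapest timing at which it may run.
    """
    return [name for name, cost in rules.items() if cost <= validation_timing]


# Hypothetical rules: one cheap enough for every keystroke, one expensive.
rules = {"cheap_rule": Timing.OnType, "expensive_rule": Timing.Manual}

validated = [t.name for t in Tier if should_validate(t, Tier.External)]
on_type = rules_to_run(rules, Timing.OnType)
manual = rules_to_run(rules, Timing.Manual)
never = rules_to_run(rules, Timing.Never)
```

With these assumptions, Never selects no rules at all, which matches the note that it effectively skips the validation stage.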