Data Driven Workflows - Workflow Specification #147
Replies: 7 comments 15 replies
-
SHORT VERSION
I like the idea of using UML or some other existing specification used for class specification/relationship diagrams. I also agree that JSON is a given.
LONG VERSION
Additionally, I think a crappy way to define those persistence/presentation types is simply to define those selections as individual enumerations and then put those enums on the diagram. Example: [diagram not included]
That gets us to a point where you can define it on the diagram and potentially process it in the code. BUT I'm sure we'll find a better way.
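A rough sketch of what "selections as individual enumerations" might look like in Swift. The type names `PersistenceOption` and `LaunchOption` are made up for illustration, and the cases are simply values that appear elsewhere in this thread; none of this is a confirmed SwiftCurrent API.

```swift
// Hypothetical enums: names and cases are illustrative assumptions,
// borrowed from values mentioned elsewhere in this discussion.
enum PersistenceOption: String, CaseIterable, Codable {
    case removedAfterProceeding
    case conditional
}

enum LaunchOption: String, CaseIterable, Codable {
    case modal
    case popover
}

// A diagram (or any data file) can refer to a case by its raw string,
// and code can map that string back to the enum, or reject it.
let recognized = LaunchOption(rawValue: "popover")   // a known case
let unrecognized = LaunchOption(rawValue: "widget")  // nil, not a case here
```

Because the enums are `CaseIterable`, a tool could also list every allowed value when generating or checking a diagram.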
-
@wwt/workflow-developers The proposal for Data-Driven Workflow specifications is ready for review. Feel free to critique, add thoughts and feature ideas, ask questions that the proposal didn't cover, and suggest changes.
-
I like the idea of UML and also agree on JSON and YAML - those two feel like must-haves in the industry today.
-
I am starting a new answer to cover possible JSON schemas. Here is my suggested schema:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "version": {
      "type": "string"
    },
    "sequence": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "flowRepresentableName": {
            "type": "string"
          },
          "flowPersistence": {
            "type": "object",
            "properties": {
              "style": {
                "type": "string"
              }
            },
            "required": [ "style" ]
          },
          "launchStyle": {
            "type": "object",
            "properties": {
              "style": {
                "type": "string"
              },
              "substyle": {
                "type": "string"
              }
            },
            "required": [ "style" ]
          }
        },
        "required": [ "flowRepresentableName" ]
      }
    }
  },
  "required": [
    "version",
    "sequence"
  ]
}
This one defines the minimum amount of information needed to define a workflow. The smallest document that satisfies it is:
{
  "version": "0.0.1",
  "sequence": [
    {
      "flowRepresentableName": "FR1"
    }
  ]
}
While simple, it allows for growth without breaking the existing schema. For example, because flowPersistence is an object and not just a string value, you can define additional properties on flowPersistence down the road. Perhaps you do something like this:
{
  "version": "0.0.1",
  "sequence": [
    {
      "flowRepresentableName": "FR1",
      "flowPersistence": {
        "style": "conditional",
        "condition": "args == 10"
      }
    }
  ]
}
While that condition may not be what we ultimately want, it allows us to figure that out without a breaking change to the schema. That leads to a critical consideration for this schema: I wanted it to keep making sense for a potential Android or Web consumer and not be confusing in those contexts. Hopefully, it will make sense in any context on any platform.
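To make the "room to grow" point concrete, here is a minimal sketch of how a Swift consumer might decode documents of this shape. The type names (`WorkflowSpec`, `Step`, `Persistence`, `Launch`) and the `decodeSpec` helper are hypothetical, and the sketch assumes the persistence object always carries a `style` key, as the schema requires.

```swift
import Foundation

// Hypothetical Swift mirror of the suggested schema; names are illustrative.
struct WorkflowSpec: Decodable {
    let version: String
    let sequence: [Step]

    struct Step: Decodable {
        let flowRepresentableName: String
        let flowPersistence: Persistence?  // optional in the schema
        let launchStyle: Launch?           // optional in the schema
    }

    struct Persistence: Decodable {
        let style: String
        let condition: String?  // a later addition; older documents omit it
    }

    struct Launch: Decodable {
        let style: String
        let substyle: String?
    }
}

/// Decodes a spec from a JSON string, throwing if required keys are missing.
func decodeSpec(_ json: String) throws -> WorkflowSpec {
    try JSONDecoder().decode(WorkflowSpec.self, from: Data(json.utf8))
}
```

Because `condition` is optional and unknown keys are ignored by the synthesized `Decodable` conformance, documents written before the property existed keep decoding, which is the non-breaking growth the object form is meant to allow.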
-
Capturing some discussion between @Tyler-Keith-Thompson and me: we could potentially use a build/run phase to validate that the provided JSON would actually work with the code that you have. Not sure on specifics just yet, but having some ability to fail earlier than runtime would be nice.
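One way to picture such a check, as a sketch only: given some record of the FlowRepresentable types that exist in the code, walk the sequence and fail on the first name that is unknown or the first output that does not match the next input. `KnownFlowRepresentable`, `SpecValidationError`, and `validate(sequence:against:)` are hypothetical names, and comparing type names as plain strings is a simplifying assumption.

```swift
// Hypothetical record of one FlowRepresentable, as a build tool might collect it.
struct KnownFlowRepresentable {
    let name: String
    let input: String   // type name as text, e.g. "Never" or "String"
    let output: String
}

enum SpecValidationError: Error, Equatable {
    case unknownFlowRepresentable(String)
    case typeMismatch(from: String, to: String)
}

/// Checks an ordered list of names against the known types:
/// every name must exist, and each output must match the next input.
func validate(sequence names: [String], against known: [KnownFlowRepresentable]) throws {
    let byName = Dictionary(known.map { ($0.name, $0) }, uniquingKeysWith: { first, _ in first })
    var previous: KnownFlowRepresentable?
    for name in names {
        guard let current = byName[name] else {
            throw SpecValidationError.unknownFlowRepresentable(name)
        }
        if let previous = previous, previous.output != current.input {
            throw SpecValidationError.typeMismatch(from: previous.name, to: current.name)
        }
        previous = current
    }
}
```

The same function could run in a build phase, a test, or at launch; only the source of the `known` list would differ.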
-
PlantUML seems very interesting. I am wondering how much we're willing to take on in regards to allowing it to be a data definition. For JSON and YAML we get kinda lucky and have a schema we can use to validate. We also get parsers pretty cheap. For PlantUML we'd need to write both a schema and a validator/parser. PlantUML also seems to enable a ton of features, and I'm wondering how we might decide what to accept and what not to. Should we make a parser that is spec compliant, or only enable the features we really need? Just noodling here before diving into trying to make a schema and parser.
-
At the moment we validate this object with our schema with no issues:
{
  "schemaVersion": "v0.0.1",
  "sequence": [
    {
      "flowRepresentableName": "FR1"
    },
    {
      "flowRepresentableName": "FR2",
      "launchStyle": "modal",
      "flowPersistence": "removedAfterProceeding"
    },
    {
      "flowRepresentableName": {
        "watchOS": "FR3",
        "macOS": "FR3",
        "iOS": "FR3",
        "iPadOS": "FR3",
        "tvOS": "FR3",
        "android": "FRA3"
      },
      "launchStyle": {
        "watchOS": "modal",
        "macOS": "modal",
        "iOS": "modal",
        "iPadOS": "popover",
        "tvOS": "modal",
        "android": "widget"
      },
      "flowPersistence": {
        "watchOS": "removedAfterProceeding",
        "macOS": "removedAfterProceeding",
        "iOS": "removedAfterProceeding",
        "iPadOS": "removedAfterProceeding",
        "tvOS": "removedAfterProceeding",
        "android": "somethingElse"
      }
    },
    {
      "flowRepresentableName": {
        "*": "FR3",
        "android": "FRA3"
      },
      "launchStyle": {
        "*": "modal",
        "iPadOS": "popover",
        "android": "widget"
      },
      "flowPersistence": {
        "watchOS": "removedAfterProceeding",
        "macOS": "removedAfterProceeding",
        "iOS": "removedAfterProceeding",
        "iPadOS": "removedAfterProceeding",
        "tvOS": "removedAfterProceeding",
        "android": "somethingElse"
      }
    }
  ]
}
But should we also enable documents that mix strings and objects? In other words, should we allow mixing, or should we enforce that a user chooses one or the other? I'm thinking we'd allow mixing, but would like to know what the others think.
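If mixing were allowed, the decoding side does not have to get much harder. The following is a sketch under assumptions: `PlatformValue` and `MixedStep` are hypothetical names, and treating "*" as a fallback for unlisted platforms is inferred from the example above rather than from any agreed rule.

```swift
import Foundation

/// A value written either as a bare string or as a per-platform object.
enum PlatformValue: Decodable, Equatable {
    case uniform(String)
    case perPlatform([String: String])

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        if let single = try? container.decode(String.self) {
            self = .uniform(single)
        } else {
            // Anything that is neither a string nor a string map is an error.
            self = .perPlatform(try container.decode([String: String].self))
        }
    }

    /// Resolves the value for a platform, falling back to the "*" entry.
    func resolved(for platform: String) -> String? {
        switch self {
        case .uniform(let value):
            return value
        case .perPlatform(let values):
            return values[platform] ?? values["*"]
        }
    }
}

// Each property independently accepts either form, so one item can mix them.
struct MixedStep: Decodable {
    let flowRepresentableName: PlatformValue
    let launchStyle: PlatformValue?
    let flowPersistence: PlatformValue?
}
```

With this shape, an item such as `{"flowRepresentableName": {"*": "FR3", "android": "FRA3"}, "launchStyle": "modal"}` decodes without the author having to pick one style for the whole item.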
-
Our milestone calls out the desire for data-driven workflows. This proposal walks through what the specification can be/should be.
What is the Specification Describing?
Ultimately, data-driven workflows describe several important things:
Out of Scope
- [...] FlowRepresentable should load FlowRepresentables from data
Valid Specs
A specification (in any format) should be considered valid if:
- FlowRepresentables listed actually exist in the codebase
- FlowRepresentable.WorkflowOutput of a given item matches the FlowRepresentable.WorkflowInput of the next item
- [...] FlowRepresentable.WorkflowInput of the first item
- [...] FlowRepresentable is one of the valid FlowPersistence types
- [...] FlowRepresentable is one of the valid LaunchStyle types
This has several implications:
In order to validate a given specification, we need access to the codebase that contains the FlowRepresentables defined. I can see 2 paths forward.
Run-time validation
We could simply perform validation of a workflow specification at runtime. This is necessary regardless of any other choices we make in this proposal. The most reasonable way to handle this is probably something like a throwing initializer for Workflow and/or WorkflowLauncher that takes in some decodable data. This means consumers are forced to deal with the possibility that a workflow cannot be decoded. It falls in line nicely with other codable/decodable concepts and shouldn't cause too much consternation.
Pre-compile validation
This may be a later feature that we add. The idea here is that someone designing workflows from data will want to validate their specification. This is achievable if we create some kind of intermediary representation of the FlowRepresentables in the codebase. This idea is hopelessly inspired by contract testing with pact.io. Imagine 2 responsible parties:
Consumers
Consumers can start by running a SwiftCurrent-provided tool. This probably makes the most sense as a CLI utility. It could use SwiftSyntax to look through the AST output of a given Swift application and find all FlowRepresentable types in the codebase. This means that it may not be able to find statically linked symbols. The limitation of statically linked symbols needs investigation. Even if true, it's not that much of a concern, as the vast majority of SwiftCurrent users will have access to the code such that the AST explorer could do its thing.
After exploring the AST and discovering all FlowRepresentables in the codebase, the CLI could output some intermediary format. Bonus points if that format is not something we create; for example, it might be able to just give literal AST output in JSON form. Whenever a provider needs to validate their data, they'd need that JSON file.
Providers
Providers would require some intermediary representation of available FlowRepresentables from consumers. They would also create one specification per workflow, not one specification for all workflows. Those don't necessarily have to be different documents, but the specification, as noted above, is just for one workflow.
The provider could run a linter on their data using the intermediary representation from consumers. The linter would follow the validation rules listed above. Providers would need updated documents from consumers any time new FlowRepresentables became available for use. It's then crucial that consumer representations are versioned, so providers have all the information they need.
Formats
It's first worth noting that there does not have to be a single format. It seems prudent to limit the number because each format is something that we have to maintain, but abstractions in Swift (and most languages, for that matter) mean that it makes little difference if the format is JSON, YAML, TOML, XML, Protobuf, UML, INI, or some other insane format we want to throw out. This proposal is going to walk through common formats, pros and cons, and make a recommendation on a per-format basis on whether we should adopt it for the workflow specification.
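To illustrate the abstraction point, here is a small sketch in which the loading code never learns which format it was handed. `SpecDecoding`, `SimpleSpec`, and `loadSpec` are hypothetical names. Property lists stand in as the second format only because Foundation ships a decoder for them; YAML or TOML would need a third-party decoder conforming to the same protocol.

```swift
import Foundation

/// Hypothetical abstraction over anything that turns Data into a Decodable.
protocol SpecDecoding {
    func decode<T: Decodable>(_ type: T.Type, from data: Data) throws -> T
}

// Foundation's decoders already expose a method with this shape,
// so the conformances need no extra code.
extension JSONDecoder: SpecDecoding {}
extension PropertyListDecoder: SpecDecoding {}

// A deliberately tiny spec shape, just enough to show the idea.
struct SimpleSpec: Decodable {
    let version: String
    let sequence: [String]
}

/// Loads a spec without caring which format the bytes are in.
func loadSpec(from data: Data, using decoder: SpecDecoding) throws -> SimpleSpec {
    try decoder.decode(SimpleSpec.self, from: data)
}
```

Supporting another format then means supplying one more conforming decoder, which is why the maintenance cost sits in each format's parser rather than in the workflow code.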
JSON
This is certainly an industry standard and the go-to format for most engineers.
Pros
Cons
Adopt: Despite its flaws for workflow description, JSON is such an industry standard that we'd be foolish not to adopt it.
PlantUML
Okay, I know how insane this sounds. However, hear me out here. UML is great at describing sequences. Workflows are sequences. What if your documentation for describing flows in your apps could literally be used to create flows in your apps? I think this is worth investigating.
Pros
Cons
Adopt: Look, I recognize how insane the idea is. However, right now the out-of-the-box thinking is really speaking to me. Documentation Driven Development is a great idea, it's just that manual efforts make it unreasonable sometimes. We can automate that in really interesting ways.
YAML
YAML is quickly becoming the go-to format for configuration. It focuses more on human readability and has more readily available tooling for non-developers than JSON.
Pros
Cons
Adopt: YAML feels like a given. The white-labeling crowd will probably be appreciative. This recommendation comes with an asterisk. If you can only adopt one format it should not be YAML. This is because it is not friendly for servers to send to clients.
Protobuf
Google's protobuf implementation is fantastic. You get safe, performant, and code-friendly serialization and deserialization.
Pros
Cons
Reject: Protobuf is frankly amazing. I would welcome community contributions but I don't see enough benefit for us to provide 1st party support for it out of the box.
XML
XML may be an older format but it's perfectly positioned to describe workflows. It has the concept of nodes, it can self-reference, and you could easily describe a doubly linked list in XML.
Pros
Cons
Reject: XML has some surprising advantages here, but once again it's probably not worth 1st party support. If for some reason, we get community interest I could be persuaded otherwise. Any community contributions would be welcomed.
TOML
TOML was designed to be an even easier YAML. It shares similarities with INI files but has a more standardized format and supports nesting. Ultimately it's just a big hashmap.
Pros
Cons
Reject: I see no real benefits over YAML for workflow specifications. Just like the other rejections, community contributions would be welcomed. However, first-party support seems unnecessary.