Flowmetal

A shining mercurial metal, laden with sensors and almost infinitely reconfigurable. The stuff of which robots and servitors are made.

Flowmetal is a new clustered scripting platform, designed to make it easy to write high-reliability automation (usually called workflows) in networked environments.

Some example applications:

  • 🗹 Implementing CI flows
  • 🗹 Reacting to webhooks
  • 🗹 Integrating between RESTful systems
  • 🗹 Orchestrating batch jobs (via REST or other APIs)

Key Features

  • Simple, Familiar Syntax - No new DSL notation to learn. Just a Python subset (Starlark).
  • Scripting ease - Just flowmetal run ./script.flow, no need to wait for a scheduled deployment or a slow build.
  • Scripts are services - No need to repackage anything, just spawn a worker to live forever.
  • Composable by default - As in Python, scripts are libraries and flows are just functions to be called and reused.
  • Designed for multitenancy - Resource quotas and isolation are in the box.
  • Reliability for free - Flowmetal’s design ensures that flows cannot get lost or stuck.
  • Easy to operate - Failure-oblivious flows allow the grid to scale up or down and undergo maintenance without interruption.

Example: Hello, world

# The implementation is defined first so that the name is bound
# by the time flow() refers to it.
def _hello_impl(fctx: FlowContext):
    # Flows access the outside world via actions,
    # which are built in to the runtime.
    fctx.actions.write(
        fctx.args._stdout,
        fctx.args.template.format(
            user = fctx.quota.username,
        ),
        # Stdout is unbounded so the write won't block
        # But writes can choose to time out instead
        timeout = None,
    )

hello = flow(
    implementation = _hello_impl,
    # There are three default arguments.
    #  - _stdout, _stderr (unbounded writable inbox)
    #  - _stdin  (readable inbox)
    args = {
        "template": args.string(
            default = "Hello from Flowmetal, {user}!",
        )
    },
    resources = {},
    # Flows can consume secrets.
    # Secrets are inherited from the root flow.
    # We don't need any.
    secrets = {},
)

if __name__ == "__flow__":
    # Args may be specified by keyword
    # _stdout etc are inherited
    hello(
        template = "Hello, {user}! I'm Flowmetal!"
    )

How It Works

             +--------------+
 Users  <--> |  Client API  |
             +--------------+
                    |
                    V
             +--------------+
             |   Flow DB    | <--> Connectors
             +--------------+
                    ^
                    |
             +--------------+
Workers <--> |  Worker API  |
             +--------------+

Flowmetal itself is a grid execution environment consisting of a user-facing API service, a database, a worker-facing API service and a fleet of stateless workers.

When we invoke flowmetal run --context=default ./hello.flow, the Flowmetal CLI assembles this script and its dependencies into a bundle, uploads that bundle to the Flowmetal grid, and requests that the flow be instantiated there under the role and quota of the connected user.

Subject to the authenticated user’s quota, the requested flow will be picked up by a worker, which downloads the flow bundle and executes the Starlark it contains. The bundling strategy is designed to make both uploads and downloads incremental, both for speed and for transit efficiency.
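One common way to get incremental transfers is content addressing: each file in a bundle is identified by a hash of its bytes, and only blobs the other side does not already hold are sent. The sketch below assumes that strategy purely for illustration; the source does not say how Flowmetal bundles are actually built, and the function names and file contents are hypothetical.

```python
import hashlib
from typing import Dict, Set

def digest(content: bytes) -> str:
    """Content address for one file: the SHA-256 hex digest of its bytes."""
    return hashlib.sha256(content).hexdigest()

def build_manifest(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each path in the bundle to the digest of its content."""
    return {path: digest(content) for path, content in files.items()}

def blobs_to_upload(manifest: Dict[str, str], already_stored: Set[str]) -> Set[str]:
    """Return only the digests that the remote side does not hold yet."""
    return set(manifest.values()) - already_stored

# Hypothetical bundle contents: only hello.flow changes between versions.
v1 = {"hello.flow": b"print('hi')", "lib/util.flow": b"def f(): pass"}
v2 = {"hello.flow": b"print('hello')", "lib/util.flow": b"def f(): pass"}

stored: Set[str] = set()
first = blobs_to_upload(build_manifest(v1), stored)   # both files are new
stored |= first
second = blobs_to_upload(build_manifest(v2), stored)  # only the edited file
```

Under this assumption the second upload carries a single blob, and a worker that already cached lib/util.flow would skip downloading it for the same reason.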

Starlark execution happens with normal Python script semantics. The __name__ global is bound to __flow__, and a global flow context is established.

All the fctx.actions.* calls which a flow performs are written into a commit log as requests. If the action is blocking, then the flow is suspended until a response is received or a timeout occurs. Responses are inserted into the same log by the Flowmetal runtime.
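The request/response pairing can be pictured as an append-only log in which every response names the request it answers. The following is a minimal in-memory sketch of that idea; the class names, field names and the "write" payload are assumptions made for illustration, not Flowmetal's actual log format.

```python
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class Entry:
    kind: str          # "request" or "response"
    request_id: int    # a response carries the id of the request it answers
    payload: Any

@dataclass
class ActionLog:
    entries: List[Entry] = field(default_factory=list)

    def request(self, action: str, **params: Any) -> int:
        """Append a request and return its id (requests are numbered in order)."""
        request_id = sum(1 for e in self.entries if e.kind == "request")
        self.entries.append(Entry("request", request_id, {"action": action, **params}))
        return request_id

    def respond(self, request_id: int, result: Any) -> None:
        """Append the runtime's response to an earlier request."""
        self.entries.append(Entry("response", request_id, result))

    def pending(self) -> List[int]:
        """Ids of requests with no response yet; a blocking flow waits on these."""
        answered = {e.request_id for e in self.entries if e.kind == "response"}
        return [e.request_id for e in self.entries
                if e.kind == "request" and e.request_id not in answered]

log = ActionLog()
rid = log.request("write", inbox="_stdout", data="Hello!")
blocked_on = log.pending()   # the flow would be suspended on this request
log.respond(rid, "ok")       # the runtime answers in the same log
```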

Log updates are committed back to the Flowmetal platform before any side-effects are performed. This means it is always safe for Flowmetal to restart any given flow from its last committed log state. The worst case is that the last requested action went awry and was lost, which Flowmetal can clean up by declaring it a failure and producing an error response to that request.
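To make the restart argument concrete, here is a small sketch of recovery over a committed log. It assumes the log is a list of request and response records keyed by id; the record layout, the "http_get" action name and the error body are hypothetical. Any request left without a response is closed out with an error, after which the restarted flow can replay the responses it has already observed.

```python
def recover(log):
    """Declare every unanswered request a failure by appending an error response.

    `log` is a list of dicts with keys "kind", "id" and "body".
    Returns the ids of the requests that were declared lost.
    """
    answered = {e["id"] for e in log if e["kind"] == "response"}
    lost = [e["id"] for e in log
            if e["kind"] == "request" and e["id"] not in answered]
    for request_id in lost:
        log.append({"kind": "response", "id": request_id,
                    "body": {"error": "lost"}})
    return lost

def replay(log):
    """Collect the response for each request so a restarted flow can reuse it."""
    return {e["id"]: e["body"] for e in log if e["kind"] == "response"}

# Last committed state: request 1 was logged, but the worker died before
# any response was recorded.
committed = [
    {"kind": "request", "id": 0, "body": {"action": "clock"}},
    {"kind": "response", "id": 0, "body": {"now": 1700000000}},
    {"kind": "request", "id": 1, "body": {"action": "http_get"}},
]
failed = recover(committed)
seen = replay(committed)
```

Because recovery only appends to the log, running it twice is harmless: the second pass finds nothing left to close out.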

Most actions, such as reading the clock, sleeping, communicating between flows and HTTP verbs, are built in, but Connector services can provide specialized custom actions by listening to the job logs and asynchronously returning results.
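A Connector, read this way, is a process that scans the log for requests naming an action it implements and appends the results. The sketch below is one possible shape for that loop, assuming the same list-of-records log as an illustration; the "chat.post" action and the handler registry are hypothetical and not part of any documented Flowmetal interface.

```python
def run_connector(log, handlers):
    """Answer every open request whose action this connector implements.

    `log` is a list of dicts with keys "kind", "id" and "body";
    `handlers` maps an action name to a function of the request body.
    Returns the number of requests answered in this pass.
    """
    answered = {e["id"] for e in log if e["kind"] == "response"}
    handled = 0
    for entry in list(log):  # iterate over a snapshot, since we append below
        if entry["kind"] != "request" or entry["id"] in answered:
            continue
        handler = handlers.get(entry["body"]["action"])
        if handler is None:
            continue  # built in, or owned by some other connector
        log.append({"kind": "response", "id": entry["id"],
                    "body": handler(entry["body"])})
        handled += 1
    return handled

# Hypothetical custom action supplied by a connector.
handlers = {"chat.post": lambda body: {"posted": body["text"]}}

job_log = [
    {"kind": "request", "id": 0, "body": {"action": "sleep", "seconds": 5}},
    {"kind": "request", "id": 1, "body": {"action": "chat.post", "text": "done"}},
]
count = run_connector(job_log, handlers)
```

The flow that issued request 1 never waits on the connector directly; it only sees a response appear in its log at some later point.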

Workers may choose to suspend the current flow after a result, returning it to the database until another worker picks it back up. Flows waiting on asynchronous results such as sleeps or blocking on another Flow will also be returned to the database.
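The hand-off between workers and the database can be sketched as a store holding two sets of flows: those ready to run and those waiting on a timer. This is a simplified model under stated assumptions (a single process, sleeps only, hypothetical class and method names), not a description of the real Flow DB.

```python
import heapq

class FlowStore:
    """Stand-in for the Flow DB: holds flows that no worker is running."""

    def __init__(self):
        self.runnable = []   # flow ids ready to be picked up, oldest first
        self.sleeping = []   # min-heap of (wake_time, flow_id)

    def park_runnable(self, flow_id):
        """A worker yields a flow that could continue immediately."""
        self.runnable.append(flow_id)

    def park_until(self, flow_id, wake_time):
        """A worker returns a flow that is waiting on a sleep."""
        heapq.heappush(self.sleeping, (wake_time, flow_id))

    def wake(self, now):
        """Move every flow whose sleep has elapsed to the runnable queue."""
        while self.sleeping and self.sleeping[0][0] <= now:
            _, flow_id = heapq.heappop(self.sleeping)
            self.runnable.append(flow_id)

    def pick(self):
        """Give an idle worker the next runnable flow, or None."""
        return self.runnable.pop(0) if self.runnable else None

store = FlowStore()
store.park_until("flow-a", wake_time=30)
store.park_runnable("flow-b")
first = store.pick()    # the runnable flow
idle = store.pick()     # nothing to do while flow-a sleeps
store.wake(now=30)
second = store.pick()   # flow-a resumes, possibly on a different worker
```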

We are indebted to Meiklejohn et al. for A.M.B.R.O.S.I.A., which describes this approach in detail.

I would like to learn more!

We’ve prepared some other examples showcasing features that don’t fit here.

Can I try it?

While these ideas have had years of refinement, Flowmetal is in early stages of development.

What you see at present are concepts written as exercises in hammering out Flowmetal’s Starlark interface.

We hope to have an interest mailing list and a more open development process soon.

In the meantime feedback and inquiries may be directed to flowmetal AT tirefireind DOT us. We’d love to hear what you’d like to use a platform like this for, or any commentary on the DSL!