Worker architecture

Overview

This page describes the backend architecture that supports running customer automations in workers, as opposed to guided mode, where the automation runs in the customer's browser.

Backend automations can run either in our environment (hosted in AWS) or, for customers that require it, in a self-hosted environment managed by the customer.

Each tool can potentially support two automation modes: browser-based automation (either unauthenticated or using user credentials) and API-based automation (using either an API key or application credentials).

Note: To give a broader view of the architecture, this overview includes planned capabilities for running workers in the customer's environment that are not yet available. These capabilities are marked '(planned)' for clarity.

Detailed flow

Authentication

  • Running automations in the backend requires the customer to grant the permissions each tool needs. These can be basic credentials (username and password) for driving the tool in a headless browser inside the worker, or API credentials (e.g. an API key or application credentials) for tools that support API calls.
  • Credentials are stored in AWS Secrets Manager in our environment, or in a custom secrets store for the self-hosted solution (planned).
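
The two credential types and the pluggable store described above could be modelled roughly as follows. This is a minimal sketch, not the actual implementation: `ToolCredentials`, `SecretsStore`, and `InMemorySecretsStore` are hypothetical names, and the in-memory store is only a stand-in for AWS Secrets Manager or a customer-provided store.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ToolCredentials:
    """Hypothetical record holding either browser or API credentials for a tool."""
    username: Optional[str] = None  # browser-based mode
    password: Optional[str] = None  # browser-based mode
    api_key: Optional[str] = None   # API-based mode

    @property
    def mode(self) -> str:
        # Prefer the API mode when an API key is present.
        if self.api_key is not None:
            return "api"
        if self.username is not None and self.password is not None:
            return "browser"
        raise ValueError("no usable credentials provided")


class SecretsStore(ABC):
    """Assumed interface so hosted and self-hosted stores are interchangeable."""

    @abstractmethod
    def put(self, customer_id: str, tool: str, creds: ToolCredentials) -> None:
        ...

    @abstractmethod
    def get(self, customer_id: str, tool: str) -> ToolCredentials:
        ...


class InMemorySecretsStore(SecretsStore):
    """Test-only stand-in; a real deployment would use a proper secrets backend."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, str], ToolCredentials] = {}

    def put(self, customer_id: str, tool: str, creds: ToolCredentials) -> None:
        self._data[(customer_id, tool)] = creds

    def get(self, customer_id: str, tool: str) -> ToolCredentials:
        # Raises KeyError when the customer has not provided credentials.
        return self._data[(customer_id, tool)]
```

Keeping the store behind an interface means a worker can look up credentials the same way regardless of where it is hosted.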

Triggering

  • Backend automations can be triggered in one of two ways:
  1. Automatic triggering by Legion through a poller mechanism. Customers can enable polling on a tool they use (e.g. a new email in Outlook or a new case in The Hive). Legion polls the tool periodically and triggers an automation whenever it discovers a new item.
  2. Manual triggering of a backend automation on a given input, either through an API call or through a webhook.
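
The poller mechanism in option 1 can be sketched as below. This is an illustrative sketch under assumptions: the `Poller` class, the `fetch_items` and `trigger` callables, and the `"id"` field on each item are hypothetical, and the real poller may track new items differently (for example, by timestamp).

```python
from typing import Callable, Dict, Iterable, Set

Item = Dict[str, str]  # assumed shape: every item carries a unique "id"


class Poller:
    """Triggers an automation once for each item it has not seen before."""

    def __init__(
        self,
        fetch_items: Callable[[], Iterable[Item]],
        trigger: Callable[[Item], None],
    ) -> None:
        self._fetch_items = fetch_items
        self._trigger = trigger
        self._seen_ids: Set[str] = set()

    def poll_once(self) -> int:
        """Run one polling cycle and return how many automations were triggered."""
        triggered = 0
        for item in self._fetch_items():
            item_id = item["id"]
            if item_id in self._seen_ids:
                continue
            self._trigger(item)
            # Mark as seen only after a successful trigger, so a failed
            # trigger is retried on the next cycle.
            self._seen_ids.add(item_id)
            triggered += 1
        return triggered
```

A scheduler would call `poll_once()` at the configured interval for each tool on which the customer has enabled polling.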

Orchestration

  • Once a triggering request is received, the triggering handler (investigator) stores the investigation in the DB and adds it to the triggers queue.
  • The queue absorbs spikes in trigger volume and ensures fairness between customers (one customer generating a lot of triggers shouldn't delay triggers for other customers).
  • The worker orchestrator reads pending jobs from the queue and creates a dedicated worker instance for each investigation. Once the worker is created, its details are stored in MongoDB.
  • Once the worker container starts running, it receives an API call that tells it to start investigating and contains all the necessary details.
  • The worker performs API calls to the backend using a randomly generated API key assigned to it. This lets it behave like a client browser when running the investigation (list next steps, submit step results, etc.).
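
One way to get the per-customer fairness described above is a round-robin queue, sketched below together with a worker record carrying a random API key. This is a sketch under assumptions, not the actual queue or orchestrator: `Trigger`, `FairTriggerQueue`, and `assign_worker` are hypothetical names, and the returned dictionary only illustrates the kind of record that might be stored in MongoDB.

```python
import secrets
from collections import OrderedDict, deque
from dataclasses import dataclass
from typing import Deque, Dict, Optional


@dataclass(frozen=True)
class Trigger:
    customer_id: str
    investigation_id: str


class FairTriggerQueue:
    """Round-robin across customers so one busy customer cannot starve others."""

    def __init__(self) -> None:
        # customer_id -> pending triggers; dict order is the turn order.
        self._per_customer: "OrderedDict[str, Deque[Trigger]]" = OrderedDict()

    def push(self, trigger: Trigger) -> None:
        self._per_customer.setdefault(trigger.customer_id, deque()).append(trigger)

    def pop(self) -> Optional[Trigger]:
        if not self._per_customer:
            return None
        # Take the customer whose turn it is and serve their oldest trigger.
        customer_id, pending = self._per_customer.popitem(last=False)
        trigger = pending.popleft()
        if pending:
            # The customer still has work: move them to the back of the rotation.
            self._per_customer[customer_id] = pending
        return trigger


def assign_worker(trigger: Trigger) -> Dict[str, str]:
    """Build a hypothetical worker record with a randomly generated API key."""
    return {
        "investigation_id": trigger.investigation_id,
        "customer_id": trigger.customer_id,
        "api_key": secrets.token_urlsafe(32),
    }
```

With this scheme, if customer A enqueues three triggers and customer B then enqueues one, B's trigger is served second rather than last.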