Developer's Guide

overview »

Humatron is a set of APIs that lets you transform any GenAI implementation into a hirable AI worker. These APIs, along with the corresponding web UX/DX, provide:
  • Support for autonomous work
  • Social and organizational intelligence
  • Payment processing
  • Regulatory compliance and reporting
  • Human-in-the-loop curation
  • Human-AI communications including Slack, Email, REST and SMS
  • Interviewing, hiring and onboarding
  • AI workforce planning and management
  • AI workforce analytics
  • AI worker publishing
It is important to note that Humatron does not develop its own AI workers, nor does it build specific AI-based core skills. Instead, it provides an API platform, a toolkit, that allows you to quickly transform your AI agents or any GenAI implementations into hirable AI workers. Humatron enables this transformation through its Worker API and Platform API.
Fig. 1

integration »

Integration with Humatron is done using the REST Worker API. Typically, you wrap your GenAI application in a web service that exposes a public HTTP endpoint. You are responsible for hosting this web service and providing the implementation of the specific core skills. All you have to do then is handle the set of HTTP requests that the Humatron platform sends to that endpoint. There is also an additional Platform API that lets you perform some administrative and lifecycle operations programmatically rather than through the Humatron website, if so desired.
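To make the contract concrete, here is a minimal hand-rolled sketch of such an endpoint. It is illustrative only: the route, the handler logic and the response field name are assumptions (the Python SDK shown later in this guide provides a complete implementation); only the req_cmd field and the payload container follow the Worker API structure referenced here.

# Minimal illustrative sketch of a worker endpoint (not the SDK).
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.post('/humatron/worker')  # hypothetical route
def worker_endpoint():
    req = request.get_json()
    cmd = req.get('req_cmd')  # e.g. 'heartbeat', 'message', 'register', ...
    # Hand the request off to your GenAI implementation based on `cmd` and
    # collect any response payloads that are ready to go back to Humatron.
    payloads = []  # filled by your own processing logic
    return jsonify({'payloads': payloads})  # field name illustrative; see Worker API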
Publish your AI worker on the Humatron platform and start testing and improving it in your own private sandbox. When you think it is ready, add the rest of the required information, such as compensation and supported countries and languages, and mark it public to make it discoverable and hirable. You can also keep it private, available for hire only to your organization.
In a nutshell - that's it. 👍

terminology »

Let's review the terminology used by the Humatron platform:
  • specialist
  • build
  • resumé
All these terms reference the same entity: a worker template. This template describes a worker, and when the worker gets hired, a new individual instance of this template is created and configured. When the AI builder publishes a new template to Humatron, it is called a build. A resumé is created from the build settings and is shown on the Humatron website. A specialist is the internal name for the worker template.
  • instance
  • agent
  • worker
  • hire
These terms collectively refer to an instance of the worker template. Unlike the worker template, a worker instance is a fully configured instance of that template, optionally hired by an employer organization. For example, a worker instance has a specific name, age and gender preferences, communication and social configuration, timezone, work location, primary language, etc.
  • implementation
A worker implementation is the GenAI service developed and hosted by the AI builder. This application integrates with the Humatron platform via the Worker API. The implementation is multi-tenant by design, meaning that the single API endpoint provided when it was submitted to the Humatron platform is responsible both for the worker template lifecycle and for all hired worker instances and their operations, including agentic workflow.
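As an illustration of this multi-tenancy, an implementation might track per-instance state keyed by instance ID, creating it on register and dropping it on unregister. A minimal sketch (the names and structure below are assumptions, not part of the API):

# Illustrative multi-tenant bookkeeping: one endpoint serves the worker
# template and every hired instance, so state is tracked per instance ID.
_instances: dict[str, dict] = {}

def on_register(instance_id: str) -> bool:
    _instances[instance_id] = {}  # per-instance state and configuration
    return True                   # accept (or reject) the new hire

def on_unregister(instance_id: str) -> None:
    # No further requests will arrive for this instance ID.
    _instances.pop(instance_id, None)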

async mode »

One important feature of the REST Worker API is its logically asynchronous nature. As discussed in Worker API, both HTTP requests and responses act as simple containers for payload objects. The actual data exchange is carried by the payload objects inside these containers. If an AI worker receives a request that requires significant processing time, the response (i.e. its payload) can be sent later, in the response to one of the subsequent requests. While the HTTP protocol is technically synchronous, the Worker API provides its asynchronous capabilities by implementing two principles:
  • Apart from regular requests, Humatron sends heartbeat requests to each hired AI worker. These requests are sent out at least several times a minute, and their only purpose is to provide a regular ping to which the AI worker implementation can respond with accumulated payloads. This provides a standard idiom for AI workers to asynchronously communicate with the outside world.
  • Each HTTP response JSON object can contain zero or more payloads (see Worker API), which allow arbitrary data to be passed back to Humatron on each request (including the regular heartbeat request). In general, each HTTP response can carry a payload for:
    • the immediate request it is responding to,
    • any previous request,
    • or no request at all (a payload unrelated to any request).
The combination of these two techniques allows an AI worker to exhibit asynchronous behavior, if and when required, over the synchronous HTTP protocol. This is also the foundation for supporting autonomous work capabilities in AI workers. Note that the interview request is special and is the only exception to this rule: it is strictly synchronous, and its response can only contain one specific payload object.
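For example, an implementation might let long-running tasks deposit finished payloads into a queue and drain that queue in every response, heartbeat responses included. A minimal sketch of this idiom (the queue-based design is an assumption, not something the API mandates):

# Illustrative async idiom: background work accumulates payloads,
# and every response (heartbeat included) flushes what is ready.
import queue

_pending: queue.Queue = queue.Queue()  # payloads produced by background tasks

def on_task_finished(payload: dict) -> None:
    # Called by your long-running agentic workflow when a result is ready.
    _pending.put(payload)

def handle_any_request(req: dict) -> dict:
    # Whatever the request is (heartbeat, message, ...), attach all
    # payloads accumulated since the previous response.
    payloads = []
    while not _pending.empty():
        payloads.append(_pending.get())
    return {'payloads': payloads}  # field name illustrative; see Worker API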

Built-In Storage »

The Worker API provides a convenient persistent JSON store that an AI worker implementation can use to keep its session or global data, e.g. session state, ACLs, conversation history, context, etc. With every request, Humatron passes the current persistent state as a JSON object (see the storage object in the REST request), and every response can optionally pass back the modified state to be stored between this and the next request. Note that the storage object is "scoped" per specialist, i.e. all AI workers for that specialist receive the same storage object, and it is up to the implementation to decide how this object is further structured to accommodate the desired storage needs.
In many use cases, this built-in storage eliminates the need for a dedicated external database in the AI worker implementation, allowing it to be locally stateless. Note that the storage object is passed in its entirety with each request, so consider the size of the data you store in it and its impact on network load. Note also that the only time data is held in memory only, and may therefore be lost, is between the moment the AI worker modifies the data and the moment that data is sent back to Humatron in a response. If this is critical, consider using another type of data storage.
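Since the storage object is shared by all instances of a specialist, one workable convention (purely illustrative; the API does not prescribe any internal structure) is to key per-instance data by instance ID and bound its size:

# Illustrative structuring of the specialist-scoped storage object.
def on_message(storage: dict, instance_id: str, text: str) -> dict:
    inst = storage.setdefault(instance_id, {'history': []})
    inst['history'].append(text)
    # The storage object travels with every request, so keep it small.
    inst['history'] = inst['history'][-50:]
    return storage  # return the modified state in the response to persist it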

Requests »

Below is the list of requests that must be supported by an AI worker implementation integrating with the Humatron platform. All REST requests follow the same general structure outlined in Worker API; each request differs only in its req_cmd field and the associated payload array. Note that all requests, except for interview, are asynchronous. Here is the list of supported requests (a schematic dispatch over these commands is sketched after the list):
interview (synchronous)
This request is sent to the worker endpoint when a prospective hirer (employer or hiring manager) initiates an interview on the Humatron website (and auto-interview is not enabled). Note that the interview happens before hiring, and therefore no Instance, Contact or Resource objects are available. This is also the only synchronous request in the Worker API, and its response payload array has a specific structure.
heartbeat (asynchronous)
Apart from regular requests, Humatron sends heartbeat requests to each hired AI worker at least several times a minute. These heartbeat requests do not carry any meaningful data; their only purpose is to provide a regular ping so that the AI worker can react quasi-asynchronously over the synchronous HTTP protocol. This provides a standard idiom for AI workers to asynchronously communicate with the outside world and is also the foundation for supporting autonomous work capabilities.
register (asynchronous)
This request is sent to the AI worker endpoint when a new instance of the AI worker is hired. The AI worker implementation is assumed to be multi-tenant, i.e. it supports both the worker template lifecycle and all hired instances of that worker template. Upon receiving this request, it should perform all necessary preparation and respond by either accepting or rejecting the new hire. If accepted, the new AI worker instance commences automatic onboarding with its new employer.
unregister (asynchronous)
This request is sent to the AI worker endpoint when the worker instance is terminated. After this request, no further requests containing the given instance ID will be sent to the AI worker endpoint.
pause (asynchronous)
This request is sent to the AI worker endpoint when the worker instance is paused. In this state, the instance can only be resumed or terminated.
resume (asynchronous)
This request is sent to the AI worker endpoint when the worker instance is resumed.
message (asynchronous)
This request is sent to the AI worker endpoint when one or more new messages are available for the worker instance.
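Schematically, an implementation dispatches on the req_cmd field to cover the requests above. This skeleton is illustrative only; the SDK example later in this guide shows the real handler types:

# Illustrative dispatch over the supported request commands.
def dispatch(req: dict) -> dict:
    match req['req_cmd']:
        case 'interview':
            ...  # synchronous: respond immediately with the specific payload
        case 'heartbeat':
            ...  # flush any accumulated payloads
        case 'register':
            ...  # prepare the new instance, accept or reject the hire
        case 'unregister':
            ...  # clean up; no further requests for this instance ID
        case 'pause' | 'resume':
            ...  # suspend or restore the instance's activity
        case 'message':
            ...  # process the new messages for the instance
    return {'payloads': []}  # field name illustrative; see Worker API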

workflow »

Here is a sequence of general steps to integrate an AI worker with the Humatron platform using Python:
  • Create an account on Humatron and add a new Build. When creating a new build, you can leave all the optional fields empty (including Endpoint URL).
  • Obtain two security tokens, available on the build page (see the sketch after this list):
    • Request Token is required on the AI worker's side to verify messages from Humatron using Bearer authorization.
    • Response Token is needed on the Humatron side to verify messages from the AI worker. The response token must be added to each AI worker HTTP response using the Humatron_Response_Token header.
  • Using the Python SDK, create Python code that handles all Worker API requests.
  • Wrap the above implementation into a web service with a publicly accessible HTTP endpoint and specify this Endpoint URL in your Build's settings. During development, we recommend the ngrok service, which tunnels a publicly accessible URL to your local computer.
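The sketch below illustrates the token handling from the second step. It is an assumption of how one might hand-roll the checks; the Python SDK's Flask integration performs them for you:

# Illustrative token checks: verify the Bearer request token on inbound
# requests and attach the response token to every outbound response.
from flask import Flask, request, jsonify

REQUEST_TOKEN = '...'   # Request Token from the build page
RESPONSE_TOKEN = '...'  # Response Token from the build page

app = Flask(__name__)

@app.post('/humatron/worker')  # hypothetical route
def worker_endpoint():
    if request.headers.get('Authorization') != f'Bearer {REQUEST_TOKEN}':
        return 'Unauthorized', 401
    resp = jsonify({'payloads': []})  # field name illustrative
    resp.headers['Humatron_Response_Token'] = RESPONSE_TOKEN
    return resp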

developer tools »

Since the Worker API is based on the HTTP protocol, almost any programming language is suitable for developing an AI worker. Humatron also provides client libraries, such as the Python SDK used in the example below, to speed up development.

Example »

Let's review a minimal implementation of an AI worker. We will use the Python SDK in this example.
In our implementation, we forward all messages received from the outside world (the message request in Worker API) to an OpenAI LLM and send the LLM response back to the same channel the request was received on (i.e. Slack, Email, REST or SMS). We also use the LangChain library to construct the LLM pipeline, and the Python SDK's utilities that simplify working with REST requests. Here's the Python code that deals directly with the Humatron integration:
spec_impl.py
 1  # AI worker Example.
 2  from uuid import uuid4
 3  from humatron.worker.client import *
 4  from humatron.worker.utils import make_default_response_payload
 5  from langchain_core.output_parsers import StrOutputParser
 6  from langchain_core.prompts import ChatPromptTemplate
 7  from langchain_openai import ChatOpenAI
 8
 9  # Define the chat prompt template.
10  _prompt = ChatPromptTemplate.from_messages([('system', 'You are a helpful assistant.'), ('user', '{input}')])
11
12  # Worker implementation based on `python-sdk` library.
13  class HumatronWorkerChatGpt(HumatronAsyncWorker):
14      def __init__(self, openai_api_key: str):
15          super().__init__()
16          # Create the processing chain.
17          self._chain = _prompt | ChatOpenAI(openai_api_key=openai_api_key) | StrOutputParser()
18
19      # Implement the `process_payload_part` method.
20      def process_payload_part(self, rpp: RequestPayloadPart, _: Storage) -> ResponsePayloadPart:
21          # Process different types of request commands.
22          match rpp.body:
23              case RequestDataMessage(_) as data:
24                  # To simplify the example, we skip the check for sending a message to oneself, etc.
25                  match data.message:
26                      case RequestMessageEmail(_) as email:
27                          resp = self._chain.invoke({'input': email.text})
28                          resp_email = ResponseMessageEmail.make(sender=email.to, to=email.sender, subj='Demo', text=resp)
29                          return ResponseDataMessage.make(data.instance.id, data.resource_id, resp_email, data.payload_id)
30                      case RequestMessageSms(_) as sms:
31                          resp = self._chain.invoke({'input': sms.text})
32                          resp_sms = ResponseMessageSms.make(sender=sms.receiver, receiver=sms.sender, text=resp)
33                          return ResponseDataMessage.make(data.instance.id, data.resource_id, resp_sms, data.payload_id)
34                      case RequestMessageSlack(_) as slack:
35                          resp = self._chain.invoke({'input': slack.body['text']})
36                          resp_slack = ResponseMessageSlack.make(channel=slack.body['channel'], text=resp)
37                          return ResponseDataMessage.make(data.instance.id, data.resource_id, resp_slack, data.payload_id)
38                      case _:
39                          raise ValueError(f'Unexpected request: {data.message}')
40              case _:
41                  # We skip all `interview`, `register`, `unregister`, `pause` and `resume` logic.
42                  return make_default_response_payload(req_cmd=rpp.req_cmd, req_payload_part=rpp.body)
Comments below:
  • Lines 3, 4 - import classes from the Python SDK library. This library provides asynchronous request handling as well as a number of utility methods.
  • Line 13 - define the HumatronWorkerChatGpt class, which inherits from HumatronAsyncWorker, part of the Python SDK library.
  • Line 17 - initialize the OpenAI LLM and create the request processing chain.
  • Line 20 - implement the abstract process_payload_part method from the HumatronAsyncWorker ABC class.
  • Line 26 - handle the message request with channel_type equal to email. We return an email response to the question contained in the email, sent to the sender's address.
  • Line 30 - handle the message request with channel_type equal to SMS. We return an SMS response to the question contained in the message, sent to the sender's phone number.
  • Line 34 - handle the message request with channel_type equal to slack. We return a Slack message response to the question contained in the Slack request, sent to the channel from which the request was received.
  • Lines 27, 31, 35 - send requests to the LLM, with the text field value extracted from the request body.
NOTE: this example neither checks for possible system states nor validates request input data.
The remaining task is to wrap this AI worker implementation as a web service. In this example, we use the Python Flask web server and the utility integration provided by the Python SDK:
spec_rest.py
 1  # Web Server Integration Example.
 2  import os
 3  from dotenv import load_dotenv
 4  from humatron.worker.rest.flask.flask_server import start_flask_server
 5  from demo import HumatronWorkerChatGpt
 6
 7  # Start the REST server.
 8  def start() -> None:
 9      # Load the environment variables.
10      load_dotenv()
11
12      # Get the tokens from the environment.
13      req_token, resp_token = os.environ['HUMATRON_REQUEST_TOKEN'], os.environ['HUMATRON_RESPONSE_TOKEN']
14      openai_api_key = os.environ['OPENAI_API_KEY']
15      host, port, url = os.environ['REST_HOST'], int(os.environ['REST_PORT']), os.environ['REST_URL_WORKER']
16
17      worker = HumatronWorkerChatGpt(openai_api_key=openai_api_key)
18      start_flask_server(worker, req_token, resp_token, host, port, url, None, None, lambda: worker.close())
19
20  if __name__ == '__main__':
21      start()
Comments below:
  • Line 4 - import classes from the Python SDK library.
  • Lines 5, 18 - import and create a new instance of the HumatronWorkerChatGpt class we developed above.
Once you have your web service running, all you have to do is submit and publish your build on the Humatron platform. During submission, you'll have a number of options to set, such as support for interviews, compensation, support for agentic workflow, etc.
When initially submitted, the build is in private mode so that only members of your team have access to it, i.e. can interview and hire. This allows you to do the final testing and improvements. Once ready, you can make your build public and accessible to everyone on the Humatron platform, or keep it private and accessible only to your company; the choice is yours.
That’s all! 🎉