This website is currently dormant!

API Definitions News

These are the news items I've curated in my monitoring of the API space that have some relevance to the API definition conversation and I wanted to include in my research. I'm using all of these links to better understand how the space is defining not just their APIs, but their schema, and other moving parts of their API operations.

Making Sure My API Dependencies Include Data Provenance

I am publishing a new API for locations. I am tired of needing some of the same location based resources across projects, and not having a simple, standardized API I can depend on. So I got to work finding the most accurate and complete data set I could find of cities, regions, and countries. I settled on using the complete, and easy to use countries-regions-cities project by David Graham–providing a straightforward SQL script I can use as the seed for my locations API database.

After crafting an API for this database using AWS API Gateway and Lambda, and working my way down my API checklist, it occurred to me that I wanted to include David Graham’s work as one of the project dependencies. Giving him attribution, while honestly acknowledging my project’s dependency on the data he provided. I’m working hard to include all dependencies within each of the microservices that I’m publishing, being mindful of every data, code, and human dependency that exists behind each service I deliver. Even if I don’t rely on regular updates from them, I still want to acknowledge their contribution, and consider attribution as one layer of my API dependency discussion.

Having a dependency section of my API checklist has helped me evolve how I think about defining the dependencies my services have. I initially began tracking all other services that my microservices were dependent on, but then I quickly began adding details about the other software, data, and people the service depends on as well. I’m also pulling together a machine readable definition for tracking on my microservice dependencies. It will be something I include in the API discovery (APIs.json) document for each service, alongside the OpenAPI, and other specifications. Allowing me to track on the dependencies (and attribution) for all of my APIs, and API related artifacts that I am producing on a regular basis. Providing data provenance for each of my services, documenting the origins of all the data I’m using across my services, and making accessible via an API.

For me, having the data provenance behind each service provides me with a nice clean inventory of all my suppliers. Understanding the data, services, open source code, and people I depend on to deliver a service is important to helping me make sense of my operations. For the people behind the data, services, and open source code I depend on it helps provide attribution, and showcase their valuable contribution to the services I offer. For partner and 3rd party consumers of my services, being observable about the dependencies that exist behind a service they are depending on, helps them make much more educated decisions around which services they put to work, and bake into their applications and systems. In the end, everyone is better off if I invest in data provenance as part of my wider API dependency efforts.

An OpenAPI Service Dependency Vendor Extensions

I’m working on a healthcare related microservice project, and I’m looking for a way to help developers express their service dependencies within the OpenAPI or some other artifact. At this point I’m feeling like the OpenAPI is the place to articulate this, adding a vendor extension to the specification that can allow for the referencing of one or more other services any particular service is dependent on. Helping make service discovery more machine readable at discovery and runtime.

To help not reinvent the wheel, I am looking at using the Web API type including the extensions put forth by Mike Ralphson and team. I’d like the x-api-dependencies collection to adopt a standardized schema, that was flexible enough to reference different types of other services. I’d like to see the following elements be present for each dependency:

  • versions (OPTIONAL array of thing -> Property -> softwareVersion). It is RECOMMENDED that APIs be versioned using [semver]
  • entryPoints (OPTIONAL array of Thing -> Intangible -> EntryPoint)
  • license (OPTIONAL, CreativeWork or URL) - the license for the design/signature of the API
  • transport (enumerated Text: HTTP, HTTPS, SMTP, MQTT, WS, WSS etc)</p>
  • apiProtocol (OPTIONAL, enumerated Text: SOAP, GraphQL, gRPC, Hydra, JSON API, XML-RPC, JSON-RPC etc)
  • webApiDefinitions (OPTIONAL array of EntryPoints) containing links to machine-readable API definitions
  • webApiActions (OPTIONAL array of potential Actions)

Using the Web type would allow for a pretty robust way to reference dependencies between services in a machine readable way, that can be indexed, and even visualized in services and tooling. When it comes to evolving and moving forward services, having dependency details baked in by default make a lot of sense, and ideally each dependency definition would have all the details of the dependency, as well as potential contact information, to make sure everyone is connected regarding the service road map. Anytime a service is being deprecated, versioned, or impacted in any way, we have all the dependencies needed to make an educated decision regarding how to progress with the least amount of friction as possible.

I’m going to go ahead and create a draft OpenAPI vendor extension specification for x-service-dependencies, and use the WebAPI type, complete with the added extensions. Once I start using it, and have successfully implemented it for a handful of services I will publish and share a working example. I’m also on the hunt for other examples of how teams are doing this. I’m not looking for code dependency management solutions, I am specifically looking for API dependency management solutions, and how teams are making these dependencies discoverable in a machine readable way. If you know of any interesting approaches, please let me know, I’d like to hear more about it.

Considering How Machine Learning APIs Might Violate Privacy and Security

I was reading about how Carbon Black, an endpoint detection and response (EDR) service, was exposing customer data via a 3r party API service they were using. The endpoint detection and response provider allows customers to optionally scan system and program files using the VirusTotal service. Carbon Black did not realize that premium subscribers of the VirusTotal service get access to the submitted files, allowing an company or government agency with premium access to VirusTotal’s application programming interface (API) can mine those files for sensitive data.

It provides a pretty scary glimpse at the future of privacy and security in a world of 3rd party APIs if we don’t think deeply about the solutions we bake into our applications and services. Each API we bake into our applications should always be scrutinized for privacy and security concerns, making sure end-users aren’t being subjected to unnecessary situations. This situation sounds like it was both API provider and consumer contributing to the privacy violation, and adjusting platform access levels, and communicating with API consumers would be the best path forward.

Beyond just this situation, I wanted to write about this topic as a cautionary tale for the unfolding machine learning API landscape. Make sure we are thinking deeply about what data and content we are making available to platforms via artificial intelligence and machine learning APIs. Make sure we are asking the hard questions about the security and privacy of data and content we are running through machine learning APIs. Make sure we are thinking deeply about what data and content sets we are running through the machine learning APIs, and reducing any unnecessary exposure of personal data, content, and media.

It is easy to be captivated by the magic of artificial intelligence and machine learning APIs. It is easy to view APIs as something external, and not much of a privacy or security threat. However, with each API call we are inviting a 3rd party API into our databases, files, and other private systems. Let’s make sure we have an honest conversation with our API providers about how data and content is accessed, stored, cached, and used as part of any AI or ML process. Let’s make sure we get clarification on which partners, or other 3rd party providers are getting access to data and content that is indexed and executed as part of AI and ML API requests and responses. How long are videos or images stored? How long is data stored?

I’m seeing more discussion around dependencies going on in the API space. Which software libraries, and APIs are we depending on for our applications and services. I’m feeling like this conversation is going to continue expanding and security, privacy, and observability is going to become a more significant part of these dependency discussions. It will be a conversation that continues to push API deployment on-premise, and on-premise, being observable about how ML and AI API operations are being logged, stored, and track on. I’m going to keep watching how APIs are intentionally or unintentionally violating security and privacy like this, and keep an eye on the API dependency conversation to see how it evolves as part of this security and privacy discussion.

Providing Code Citations In Machine Learning APIs

I was playing around with the Style Thief, an image transfer API from Algorithmia, and I noticed the citation for the algorithm behind. The API is an adaptation of Anish Athalye’s Neural Style Transfer, and I thought the algorithmic citation of where the work was derived from was an interesting thing to take note of for my machine learning API research.

I noticed on Algorithmia’s page there was a Bibtex citation, which referenced the author, and project Github repository:

@misc{athalye2015neuralstyle, author = {Anish Athalye}, title = {Neural Style}, year = {2015}, howpublished = {\url{}}, note = {commit xxxxxxx} }

This provides an interesting way to address citation in not just machine learning, but with open source driving algorithmic APIs in general. It gives me food for thought when it comes to what licensing I should be considering when wrapping open source software with an API. I’ve been thinking about dependencies a lot lately when it comes to APIs and their definitions, and I’d consider citation or attribution to be in a similar category. I guess rather then technical dependency, it is more in the business and legal dependency category.

Similar to how Pivio allows you to reference dependencies for your microservices, I’m thinking that API Commons, or some other format like Bibtext could provide a machine readable citation, that could be indexed as part of an APIs.json index. Allowing us API designers, architect, and providers to provide proper citation for where our work is derived. These aren’t just technical dependencies, but also business and political dependencies, that we should ensuring are present with each API we deploy, providing an observable provenance of where ideas come from, and a honest look at how we build on the work of each other.

Including API Dependencies Within Your API Definition

I was learning about Pivio, a discovery specification for microservices the other day, and found their focus on microservice dependency to be pretty interesting. API dependencies has been an area I have found myself increasingly thinking about, as well as tracking on in my API research. I’m pretty close to publishing a project dedicated to understanding API, and microservices dependencies, which would overlap with containers, serverless, and other aspects of the API lifecycle that are all about breaking down the monolith.

Each service definition using Pivio has a depends_on object, which allows for defining both internal and external service dependencies. Here is a snippet from a sample Pivio document to help articulate this interesting feature:

This is where you can start connecting the cords between all of your services, something that is applicable to any APIs, whether you’ve drank the microservices kool-aid or not. I find it interesting that Pivio has internal, and external. I’d love to see an OpenAPI linked off each of the services it depends on. I also am fascinated with the first question for external, why? What a great first question for any dependency–it should also be available for internal services as well. Every API provider should be inquiring why a dependency exist whenever possible, and having it quantified in this way just opens up more chances for this question to get asked.

Seeing dependencies quantified in Pivio makes me happy. It has been something I’ve wanted to reflect in APIs.json for some time now. Currently, I do not know of any way to quantify the relationship between APIs, and Pivio provides a glimpse at one possible way we might be able to map this world out. I have been learning more about Cytoscape, a network data integration, analysis, and visualization framework. Having a machine readable API dependencies definition would allow me to create a network visualization of any API or microservices discovery catalog. It wouldn’t take much work at all to map out API and microservice dependencies at the internal and external levels, further helping to quantify the platforms we operate.

If you think there is a link I should have listed here feel free to tweet it at me, or submit as a Github issue. Even though I do this full time, I'm still a one person show, and I miss quite a bit, and depend on my network to help me know what is going on.