2 minute read

The free tier of GitHub Packages has limited bandwidth for downloading private artifacts, which can make it unsuitable for use in a CI/CD pipeline for projects on a budget. To increase the usability of GitHub Packages, this article develops an alternative approach that minimizes the dependency on GitHub Packages as hot storage while preserving it as a viable, durable cold storage solution. Building a cost-effective CI/CD pipeline on the GitHub platform means using the unlimited egress bandwidth afforded to GitHub Actions to its fullest potential.

GitHub Packages as a Maven repository

In an earlier article, Downloading from GitHub Packages using HTTP and Maven, we investigated GitHub Packages as a Maven repository for Java artifacts. Standard practice with any network service is to evaluate the benefit of implementing a local cache. A local cache can speed up downloads, allow customized permissions, and increase resiliency against network failures. Even if a local Maven repository proxy cache such as Artifactory or Nexus is selected, the question remains of how to get artifacts into the local cache if GitHub Packages transfer limits are being hit.

GitHub Actions have Unlimited Egress (Transfer-Out)

In a GitHub CI/CD pipeline where compilation occurs within a GitHub Action, a solution is to use the unmetered egress bandwidth available during post-compilation actions. Compiled artifacts can be stored in GitHub Packages and also transferred to a Nexus or Artifactory proxy cache while bandwidth is available for free, avoiding metered download requests from the proxy to GitHub Packages.

Security and Implementation of a push solution

Maven Repository Checksums

The question now is how to securely transfer files out of a GitHub Action to remote endpoints. All receivers should accept only authenticated requests from GitHub Actions. Does this require receivers to implement token authentication? Practically speaking, no. Taking a step back, this isn’t about user authentication; it is about file authentication. For a file to be authentic, it needs to exist within GitHub Packages. A file uploaded to the receiver is authentic and secure if and only if it matches a corresponding Maven checksum in our GitHub Packages repository.
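
As a minimal sketch of this file-authentication idea, the receiver can hash the uploaded bytes and compare the result with the contents of the artifact's Maven `.md5` sidecar file. The object and method names here are hypothetical, and the sketch assumes the checksum file holds a hex digest optionally followed by a file name; it is not the project's actual implementation.

```scala
import java.security.MessageDigest

// Hypothetical helper, not part of the original project.
object ChecksumCheck {

  /** Hex-encoded MD5 digest of the given bytes. */
  def md5Hex(bytes: Array[Byte]): String =
    MessageDigest.getInstance("MD5")
      .digest(bytes)
      .map(b => f"${b & 0xff}%02x")
      .mkString

  /** True when the uploaded bytes hash to the digest stored in a Maven `.md5` file.
    * Assumption: the file content is a hex digest, optionally followed by
    * whitespace and a file name.
    */
  def matchesChecksum(uploaded: Array[Byte], checksumFile: String): Boolean = {
    val expected = checksumFile.trim.split("\\s+").headOption.getOrElse("")
    expected.nonEmpty && expected.equalsIgnoreCase(md5Hex(uploaded))
  }
}
```

An upload that fails this comparison would simply be rejected, since by the rule above it does not correspond to anything published in GitHub Packages.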

Authentication Flow

An authentication mechanism beyond GitHub is unwanted and unnecessary. HTTP requests to any external receiver can be authenticated by verifying that the request includes a valid GitHub auth token. The uploaded files can then be validated by accessing GitHub Packages with that same token and comparing checksums.
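
The lookup side of this flow might be sketched as follows, using the JDK's `java.net.http` request builder. The `PackagesLookup` object is hypothetical. The URL assumes the standard Maven repository layout under `maven.pkg.github.com`, and the `Bearer` header scheme is an assumption that should be checked against the GitHub Packages documentation.

```scala
import java.net.URI
import java.net.http.HttpRequest

// Hypothetical helper, not part of the original project.
object PackagesLookup {

  /** URI of the `.md5` sidecar for an artifact file.
    * Assumption: standard Maven layout (group id dots become slashes).
    */
  def checksumUri(owner: String, repo: String, groupId: String,
                  artifactId: String, version: String, fileName: String): URI =
    URI.create(
      s"https://maven.pkg.github.com/$owner/$repo/" +
        s"${groupId.replace('.', '/')}/$artifactId/$version/$fileName.md5"
    )

  /** GET request for the checksum, reusing the token received with the upload.
    * Assumption: the token is accepted as a Bearer credential.
    */
  def checksumRequest(uri: URI, token: String): HttpRequest =
    HttpRequest.newBuilder(uri)
      .header("Authorization", s"Bearer $token")
      .GET()
      .build()
}
```

If GitHub Packages rejects the request, the token is not valid for that repository and the upload can be refused without the receiver keeping any credentials of its own.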

GitHub Actions file push using POST
GitHub Action to POST artifacts using HTTP uploads to an external receiver

HTTP Server for receiving and validating GitHub Action upload requests

This project is essentially a set of server-side deployment scripts written in Scala, with Akka HTTP receiving builds from GitHub, so it can easily be integrated as a Route in existing Akka HTTP / Play deployments.

User Permissions

Upload permissions are limited to the ability to publish to GitHub Packages Maven.

Server-side permissions are completely internal to your server.

Two Deployment Parts

SBT build tasks

publishAssemblyToGitHubPackages: pushes compiled code to GitHub Packages (Maven)
uploadAssemblyByPut: pushes compiled code to your server (HTTP PUT)
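
What a task like uploadAssemblyByPut does at the HTTP level might look roughly like the sketch below. It is not the actual sbt task: the `UploadByPut` object, the receiver URL, and the header choices are all assumptions made for illustration.

```scala
import java.net.URI
import java.net.http.HttpRequest
import java.net.http.HttpRequest.BodyPublishers
import java.nio.file.Path

// Hypothetical helper, not the project's sbt task.
object UploadByPut {

  /** Builds a PUT request that streams the assembled jar to the receiver.
    * Assumptions: `receiverBase` ends with a slash, the file name becomes the
    * last path segment, and the GitHub token travels as a Bearer credential.
    */
  def putRequest(receiverBase: URI, jar: Path, token: String): HttpRequest =
    HttpRequest.newBuilder(receiverBase.resolve(jar.getFileName.toString))
      .header("Authorization", s"Bearer $token")
      .header("Content-Type", "application/octet-stream")
      .PUT(BodyPublishers.ofFile(jar))
      .build()
}
```

The request could then be sent with `java.net.http.HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding())` from a build step that runs after the publish to GitHub Packages has completed.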

HTTP Upload Server

built on Akka, handles HTTP PUT
validates upload is latest version in Maven, and has correct MD5 checksum
performs any custom server-side tasks, such as deployment and restarting
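
The "latest version" part of the validation could be sketched by reading the artifact's `maven-metadata.xml`. This is a sketch under assumptions: the `LatestVersionCheck` object is hypothetical, and it assumes the metadata served by the repository carries a `<latest>` or `<release>` element, which should be verified against real GitHub Packages responses.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory

// Hypothetical helper, not part of the original project.
object LatestVersionCheck {

  /** Reads <latest>, falling back to <release>, from maven-metadata.xml content. */
  def latestVersion(metadataXml: String): Option[String] = {
    val doc = DocumentBuilderFactory.newInstance()
      .newDocumentBuilder()
      .parse(new ByteArrayInputStream(metadataXml.getBytes("UTF-8")))

    // Text of the first element with the given tag name, if present and non-empty.
    def first(tag: String): Option[String] = {
      val nodes = doc.getElementsByTagName(tag)
      if (nodes.getLength > 0)
        Option(nodes.item(0).getTextContent).map(_.trim).filter(_.nonEmpty)
      else None
    }

    first("latest").orElse(first("release"))
  }

  /** True when the uploaded version is the one the metadata reports as latest. */
  def isLatest(uploadedVersion: String, metadataXml: String): Boolean =
    latestVersion(metadataXml).contains(uploadedVersion)
}
```

Only when both this check and the MD5 comparison pass would the server move on to its custom deployment tasks.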

Sources

GitHub

HTTP Maven Receiver

HTTP server that receives artifact uploads and verifies MD5 against Maven.
Other Posts in this Series

Compiling Scala Native in a GitHub Action; Alternatives to GraalVM

11 minute read

Scala Native is a compiler and JDK written in Scala with the goal of removing Scala’s dependency on the JVM. This isn’t meant to achieve higher performance such as with JDKs, and it targets a specialized use case not considered to be today’s typical Scala development. Its competitors are Rust and Go, not GraalVM, Java or Kotlin. This article goes through common steps and challenges encountered when compiling Scala Native for Linux with a GitHub Action.

Http4s Streams and Multipart Form-Data File Uploads

10 minute read

Streaming is the primary mechanism to reduce memory requirements for processing large datasets. The approach is to view only a small window of data at a time, allowing data to stream through in manageable amounts, matching the data window size to the amount of RAM available. A practical example is a file upload, where multi-GB file streams can be handled by MBs of server RAM. However, enforcing streaming in software code is prone to errors, and misuse or incompatible method implementations will lead to breaking stream semantics, and ultimately to OOM exceptions. This article focuses on streams within the context of file uploads, using the Http4s library for examples.

Multiplexed Services in Finagle

6 minute read

Apache Thrift is a pretty good RPC library. Methods compose a service, and the service is hosted on a raw TCP port. Even a large implementation with a hundred methods will perform effortlessly, but for organizational purposes you’ll want to group calls together into separate services. The standard Thrift protocols require that each service retain exclusive use of its own TCP port, creating a firewall maintenance nightmare.

Ostrich. Not just for stats, also for documentation.

2 minute read

Ostrich is a stats collector and reporter created by Twitter, and it is a welcome addition to any Finagle (Apache Thrift) implementation. At its core it uses an extremely lightweight com.sun.net.httpserver.HttpServer to handle JSON and HTML requests.

Developer Friendly Thrift Request Logging

3 minute read

In a system of async service calls, sometimes the most indispensable debugging tool is knowing what network traffic is occurring and when. Unfortunately for developers, Finagle’s default protocol is binary; while undecipherable as-is, it can be transformed into something a lot more useful.

Finagle Query Cache with Guava

4 minute read

For many data services, an easy way to reduce database load is to cache calls to semi-static data (i.e. append-only, or refreshed only on a set schedule), and very recent calls due to backward user navigation. Not all methods and data are suitable for caching, so any implementation will require the ability to be selective.

Reusing Finagle Server Filters on the Client

2 minute read

When using Thrift, Finagle Filters on the client inherit from SimpleFilter[ThriftClientRequest, Array[Byte]], while on the server they must inherit from SimpleFilter[Array[Byte], Array[Byte]]. In this article, we will demonstrate one approach to creating a dual-function filter without repeating code.

Thrift Client Side Caching to Speed Up Unit Tests

4 minute read

One of the largest headaches associated with network system architecture is abstracting away the network. External resources are always slower and more disjoint than working locally. While there are various caching techniques, few are suitable for use in a development environment.

Tracking Clients with Finagle

3 minute read

In a Service Oriented Architecture, a service may be used by many different clients, each with different usage patterns and performance profiles. Behind a corporate firewall, without each client authenticating itself to our server, how can we monitor a specific client if we can’t identify their requests?

Binary Semaphore Filter

5 minute read

Long-running queries are very taxing on a database because they hold on to resources, making them unavailable to other requests. And what happens if multiple identical requests are made while one is still running? Is there a natural way to share the same result, or does each request need to perform its own query? This isn’t a simple caching solution; it’s more like a subscription behaviour to a running query if one already exists, and if not, one is created.

Transitioning C# to Scala Using Thrift

less than 1 minute read

A 30-minute presentation on Sept 19th at the Scala-Toronto Meetup. The slides introduce a technical application of Apache Thrift and additional features offered by Twitter’s Finagle.

Separation of Concerns with Finagle

3 minute read

The Separation of Concerns (SoC) pattern is one of those software architectural choices that everyone agrees is helpful. It increases clarity, shortens the amount of code in the working context, and minimizes the chance of side effects. For example, two concerns that should not require entanglement: updating data and cache invalidation. Both are related, but one is concerned with business logic and database access, while the other deals with the cache servers. Finagle’s generated FutureIface can be used to keep these two separate.

Finagle ServerSet Clusters using Zookeeper

5 minute read

The key to high availability is redundancy; it follows that if uptime matters, Finagle needs to be deployed to multiple servers. This article walks through both the basic multi-host configuration using finagle-core, and a more robust deployment scenario utilizing the finagle-serversets module.

Web Components

1 minute read

At the heart of a web page, there are UI elements, and these elements interact with the user, each other, and the server. Although HTML5 expanded the original set of elements to include audio, video, and date pickers, there has been no standard way to define custom elements. Elements not specified in the HTML specification have had no support, thrusting this responsibility onto client-side and server-side web frameworks.

Maintainable Web Development without JavaScript

2 minute read

In the years 2002-2003, Internet Explorer captured 95% of world-wide browser market share. It was unfathomable to many that over the next 10 years IE would decline to just over 15%.

Polymer Data-Binding Filters

2 minute read

One useful feature of modern JavaScript libraries is 2-way data-binding. All interactive websites perform this functionality one way or another, but only a few libraries such as Ember.js, AngularJS and Polymer don’t require a single line of JavaScript.

Sortable Table with Polymer Web Components

4 minute read

As businesses now rely more heavily on web applications to perform daily operations, a user-friendly datatable/spreadsheet is indispensable to all web developers. While individual requirements vary, the core staple is the sortable table. Using Polymer’s Templates and Data-Binding, one can be implemented in a remarkably concise way.

ThreadLocal Variables and Scala Futures

5 minute read

Thread-Local storage (TLS) allows static variables to be attached to the currently executing thread. The most common use of TLS is to allow global context to be available throughout the entire call stack without passing it explicitly as a method parameter. In a web application, this allows contextual request metadata, such as the URL, to be referenced anywhere within the code handling the request, which is extremely useful for logging or auditing purposes.

Reactive Front-End with Web Components

less than 1 minute read

The Reactive Manifesto puts together the ideal architecture for today’s system infrastructure, designed to cope with the ever-increasing need for performance, reliability and responsiveness. The same evolution of expectations is taking place in the JavaScript front-end, but do the same ideas and principles apply?

Is JavaScript Replacing HTML?

7 minute read

Over time there has been an ebb and flow to the ratio of JavaScript:HTML used in websites. What motivates the change, and where is this ratio ultimately headed?

Emoji Progress Bar for SaaS Integrations

2 minute read

The command line progress bar was the first step towards graphical UI. It was an exciting addition to a numerical percent ticking away as a running task took forever to complete. It started with safe-for-everywhere ASCII characters

Scala SBT Publishing to GitHub Packages

8 minute read

GitHub Packages is a natural extension of a CI/CD pipeline created in GitHub Actions. It currently offers repositories for Java (Maven), .NET (NuGet), Ruby (Gems), JavaScript (npm), and Docker images. For a lot of users this can be a free private service if you can squeeze under the size limitation and are okay using OAuth keys managed in GitHub.

Downloading from GitHub Packages using HTTP and Maven

8 minute read

GitHub Packages is a Maven-compatible repository accessible outside of GitHub. It serves as the code repository used in Java project compilation both on workstations and within a CI/CD pipeline, as well as allowing manual file downloads through the GitHub web interface. Because it is meant for only these 2 purposes, there is no REST API available, making custom integrations more difficult than need be. This article documents the URLs exposed through Maven which can be used to create an API of simple HTTP commands. URLs to browse packages and download files will be covered, as well as steps to more effectively use free tier resources allowed on private repositories.

Lit Custom Components for SVG Generation

4 minute read

SVG markup is very similar to HTML, and the Lit Web Components library can be used to not only generate HTML custom components, but also manipulate SVG in a similar way using Lit templates. Lit is a small 5kb library that removes the boilerplate code of DOM generation, and is highly interoperable with all web frameworks since it relies on browser native custom elements.

Trade Audit Mobile App Infrastructure

3 minute read

The release of the Trade Audit mobile app is almost here. It is currently in MVP stage, but its infrastructure is a pretty typical cloud based deployment. This article discusses design choices made, evaluating how effective they were.

Akka gRPC with Let’s Encrypt

3 minute read

The Electronic Frontier Foundation (EFF) has recommendations about encrypting the web; there is no reason to be running servers over unencrypted HTTP any longer. It is irresponsible to your users and unnecessary. As such, there is a formal mechanism called HTTP Strict Transport Security (HSTS) that enforces HTTPS for all requests at the domain level. Taking it further, modern browsers include a set of domains which can only work over HTTPS. It started with Google TLDs such as .dev and .app, but the list is growing: https://hstspreload.org/.

ZIO Migration from Akka and Scala Futures

6 minute read

Akka / Apache Pekko is a robust and popular Scala framework used to build concurrent production-grade software. One of the concurrency primitives it uses is the standard scala.concurrent.Future class. Before these existed in Scala, there was the Twitter Future offering similar, but expanded functionality including cancellation / interruptibility. Ignoring the functional coding style promoted by ZIO for a second, the concurrency primitive used by ZIO, known as ZIO[R, E, A] can be viewed as a more advanced Future[A].

SBT Parallel Compile Optimizations using Quill Sub-Projects

9 minute read

Scala is slow to compile. Advanced syntax constructs and a robust type system can increase developer productivity and runtime reliability but also create extra work for the compiler. Macro libraries such as Quill are effectively programs written for the compiler, and can represent an unbounded amount of work depending on what they are trying to accomplish. Are there ways to structure our Scala 3 code to ensure that we can embrace the rich macro ecosystem without excessively long compile times?

The Many Different Forms of Concurrency and Parallelism

10 minute read

Modern software design requires an understanding of the different layers of concurrency and parallelism that can exist. Abstractions exposed by libraries and frameworks can inadvertently hide layers of parallelism when their focus is the simplification of others, and libraries trying to treat all levels of parallelism equally can be limited to low-level concepts for their common interface. In order to design optimally and avoid errors, all levels of concurrency and parallelism need to be understood no matter what framework is chosen.
