Toolbox

⚠️ This repository is now archived. The content lives on at danburzo.ro/toolbox.

This is my collection of useful tricks, tools, libraries, APIs, data sources & other fun things.

JavaScript libraries

(The ★ denotes popular, high-quality choices.)

Image processing, optical character recognition (OCR)

Repo Description Notes
tesseract.js Tesseract.js is a javascript library that gets words in almost any language out of images.  
jimp An image processing library written entirely in JavaScript for Node, with zero external or native dependencies.  
omggif JavaScript implementation of a GIF 89a encoder and decoder. For making GIFs. Animated_GIF, for example, builds on top of this library.
exif.js JavaScript library for reading EXIF image metadata. Extracting information from JPEG files.
quaggaJS QuaggaJS is a barcode-scanner entirely written in JavaScript supporting real-time localization and decoding of various types of barcodes such as EAN, CODE 128, CODE 39, EAN 8, UPC-A, UPC-C, I2of5 and CODABAR. The library is also capable of using getUserMedia to get direct access to the user’s camera stream. Although the code relies on heavy image-processing, even recent smartphones are capable of locating and decoding barcodes in real-time. For reading barcodes with a webcam or phone camera. See also JSBarcode for generating barcodes.
Lanczos.js Lanczos image resampling. This is for resizing images nicely in the browser.
sharp High performance Node.js image processing, the fastest module to resize JPEG, PNG, WebP and TIFF images. Uses the libvips library. Has native dependencies.

Drawing, illustration, motion

Repo Description Notes
chroma.js JavaScript library for all kinds of color manipulations. Convert a color from and to any format, generate pleasing color palettes and color variations. See also TinyColor, d3-color and my own culori.
p5.js A JS client-side library for creating graphic and interactive experiences, based on the core principles of Processing.  
paper.js The Swiss Army Knife of Vector Graphics Scripting.  
opentype.js Read and write OpenType fonts using JavaScript.  
two.js A renderer-agnostic two-dimensional drawing API for the web.  
snap.svg The JavaScript library for modern SVG graphics.  
rune.js A JavaScript library for programming graphic design systems with SVG.  
mojs motion graphics toolbelt for the web  
matter-js A 2D rigid body physics engine for the web  
ccapture.js A library to capture canvas-based animations at a fixed framerate.  
anime.js A lightweight JavaScript animation library with a simple, yet powerful API. It works with CSS properties, SVG, DOM attributes and JavaScript Objects.  

Data visualization

Repo Description Notes
d3 A JavaScript visualization library for HTML and SVG. The tool of choice for any data visualization project. To draw standard charts with D3, take a look at some libraries built on top of it: MetricsGraphics, plottable, d3plus, victory, semiotic. Further reading: awesome-d3.
vega-lite Vega-Lite is a high-level grammar of interactive graphics. It provides a concise JSON syntax for rapidly generating visualizations to support analysis.  
cytoscape.js Graph theory (a.k.a. network) library for analysis and visualisation.  
dagre Directed graph renderer for JavaScript. Built on top of graphlib by the same author, dagre helps you build directed graphs that you can then plug into D3 or cytoscape. See the excellent wiki for details.

3D, VR, WebGL

Repo Description Notes
three.js JavaScript 3D library. Much like D3 is the tool of choice for data visualization, three.js is the established library for working in 3D. See also pre3d (although it hasn’t been worked on in ages).
regl Fast functional WebGL.  
aframe Building blocks for the VR Web.  
blotter A JavaScript API for drawing unconventional text effects on the web. It uses three.js under the hood.
pex PEX is a collection of modular software components that work together well to create a tool for computational thinking.  
ejecta A Fast, Open Source JavaScript, Canvas & Audio Implementation for iOS. Can help you bring things built with three.js, etc. to iOS.

Audio

Repo Description Notes
Tone.js A Web Audio framework for making interactive music in the browser.  
paulstretch.js This is a JavaScript implementation - with a few improvements - of Paul’s Extreme Sound Stretch algorithm (Paulstretch) by Nasca Octavian PAUL.  
tuna An audio effects library for Web Audio.  
MIDI.js Making life easy to create a MIDI-app on the web.  
teoria Javascript taught Music Theory  
WebMidi.js WebMidi.js helps you tame the Web MIDI API. Send and receive MIDI messages with ease. Control instruments with user-friendly functions (playNote, sendPitchBend, etc.). React to MIDI input with simple event listeners (noteon, pitchbend, controlchange, etc.)  
NexusUI NexusUI is a JavaScript toolkit for easily designing musical interfaces in web browsers and mobile apps, with emphasis on rapid prototyping and integration with web audio.  
Gibberish Fast, JavaScript DSP library that creates JIT optimized audio callbacks using code generation techniques. Library behind Gibber, an audiovisual live coding environment for the browser. See also interface.js by the same author, a GUI library for music / arts applications that works with touch, mouse and motion events.
genish.js A library for generating optimized, single-sample audio callbacks in JavaScript. Inspired by gen~ in Max/MSP.  
howler.js howler.js is an audio library for the modern web. It defaults to Web Audio API and falls back to HTML5 Audio. This makes working with audio in JavaScript easy and reliable across all platforms.  

Mapping

Repo Description Notes
leaflet.js JavaScript library for mobile-friendly interactive maps.  
Stamen maps This is the code behind the Stamen maps site, which shows off our custom tiles and explains how to get them into other sites.  
turf.js A modular geospatial engine written in JavaScript.  

Gamemaking

Repo Description Notes
coquette A micro framework for JavaScript games. Handles collision detection, the game update loop, canvas rendering, and keyboard and mouse input.  

Working with documents

To create PDFs from scratch, look at pdfkit and jsPDF. To create new PDFs or modify existing ones, use pdf-lib. To view PDFs, see pdf.js.

For EPUB, see starter-book / perfect-edition.

Repo Description Notes
pdf2json A PDF file parser that converts PDF binaries to text-based JSON, powered by a fork of pdf.js  
epub.js EPUBs in the browser.  
magicbook It aims to be the best free tool for creating print and digital books from a single source.  

Algorithms, Data structures

Repo Description Notes
mnemonist Curated collection of data structures for the JavaScript language.  

App architecture

Repo Description Notes
react A declarative, efficient, and flexible JavaScript library for building user interfaces.  
svelte Cybernetically enhanced web apps.  
astro Build faster websites with less client-side JavaScript.  

Prototyping

This section contains tools that make it easier to just start working on your thing without a lot of setup.

Repo Description Notes
storybook Storybook is a development environment for UI components. It allows you to browse a component library, view the different states of each component, and interactively develop and test components. This works as a React (or Vue) playground that lets you focus on writing components.

Web scraping, data extraction

Repo Description Notes
artoo artoo.js is a piece of JavaScript code meant to be run in your browser’s console to provide you with some scraping utilities. Allows you to create custom bookmarklets to scrape a web page from your browser; I could see this being used in places where on a server you’d have to jump through some hoops, e.g. getting your Facebook saved links.
x-ray The next web scraper. See through the <html> noise. From the maker of cheerio, and built on top of it, x-ray gives you a terse, fluent API to scrape and navigate pages.
fathom A framework for extracting meaning from web pages. Seems to be still a work in progress, but it will ultimately be a tool for understanding where some particular type of content is on a page: “Where is the body? The title? Is this a ‘next page’ button? Is this a comment form, and are there comments here? By better understanding the parts of a page, we can improve our understanding of how a user interacts with it.” (source)
readability A standalone version of the readability library used for Firefox Reader View. Extract the main content from a web page.

Language processing

Repo Description Notes
natural “Natural” is a general natural language facility for nodejs. Tokenizing, stemming, classification, phonetics, tf-idf, WordNet, string similarity, and some inflections are currently supported. At the moment, most of the algorithms are English-specific.  
franc Detect the language of text. franc can tell in which of the 339 supported languages a piece of text is written.
retext Natural language processor powered by plugins.  
lunr.js Simple full-text search in your browser. Since lunr.js only supports English by default, supplement it with lunr-languages, which is a collection of language stemmers and stop words.
nlp_compromise It’s a handy, and not overly-fancy tool for understanding, changing, and playing with English.  
snowball-js javascript implementation of the popular snowball word stemming nlp algorithm  

Parsing things

Repo Description Notes
parse5 parse5 provides nearly everything you may need when dealing with HTML. It’s the fastest spec-compliant HTML parser for Node to date. It parses HTML the way the latest version of your browser does. Use this to get a barebones representation of your HTML.
jsdom jsdom is a pure-JavaScript implementation of many web standards, notably the WHATWG DOM and HTML Standards, for use with Node.js. Use this for more convenient methods to interact with the HTML (e.g. querySelectorAll()).
css CSS parser / stringifier for Node.js  
postcss PostCSS is a tool for transforming styles with JS plugins. These plugins can lint your CSS, support variables and mixins, transpile future CSS syntax, inline images, and more.  
unified Content as structured data  

Parsing JavaScript

Parsers generally produce an abstract syntax tree (AST) that follows the estree format. Use AST Explorer to look around.

To traverse and change the AST, then write it back as a string:

To evaluate an AST expression, look at eval-estree-expression.
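
To make the estree shape more concrete, here is a minimal sketch that evaluates a hand-written AST for the expression 1 + 2 * 3. The `ast` object and the `evaluate` function are hypothetical names made up for this illustration; the function only covers numeric literals and a few operators, and is not how eval-estree-expression is implemented.

```javascript
// A hand-written ESTree AST for the expression `1 + 2 * 3`.
// A parser that follows the estree format would produce nodes of this shape
// (real parsers typically add position information as well).
const ast = {
    type: 'BinaryExpression',
    operator: '+',
    left: { type: 'Literal', value: 1 },
    right: {
        type: 'BinaryExpression',
        operator: '*',
        left: { type: 'Literal', value: 2 },
        right: { type: 'Literal', value: 3 }
    }
};

// Evaluate a small subset of ESTree expression nodes by recursive descent.
function evaluate(node) {
    switch (node.type) {
        case 'Literal':
            return node.value;
        case 'UnaryExpression':
            if (node.operator === '-') return -evaluate(node.argument);
            throw new Error('Unsupported unary operator: ' + node.operator);
        case 'BinaryExpression': {
            const left = evaluate(node.left);
            const right = evaluate(node.right);
            switch (node.operator) {
                case '+': return left + right;
                case '-': return left - right;
                case '*': return left * right;
                case '/': return left / right;
                default: throw new Error('Unsupported operator: ' + node.operator);
            }
        }
        default:
            throw new Error('Unsupported node type: ' + node.type);
    }
}

console.log(evaluate(ast)); // 7
```

Traversing or transforming an AST follows the same pattern: switch on `node.type`, then recurse into the child nodes that type defines.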

Building CLIs

Repo Description Notes
commander.js node.js command-line interfaces made easy  
sade    

12 Factor CLI apps

Data sets

Repo Description Notes
Words A huge dataset of words in four languages (English, German, Spanish and French) used in Atebits’ game Letterpress.  
corpora A collection of small corpuses of interesting data for the creation of bots and similar stuff. I also keep a repo, inspired by it.
Natural Earth Vector A global, public domain map dataset available at three scales and featuring tightly integrated vector and raster data.  
countries World countries in JSON, CSV and XML.  
geonames contains over 10 million geographical names and consists of over 9 million unique features whereof 2.8 million populated places and 5.5 million alternate names.  
Geofabrik OSM Data Extracts   On continent/country level.
Mapzen Metro Extracts City-sized portions of OpenStreetMap, served weekly  
all-the-cities All the 138,398 cities of the world with a population of at least 1000 inhabitants, in a big JSON array.  
Awesome public datasets    
whiskyverse   JSON file containing Scotch Malt Whisky Society bottles

APIs

Link Description Notes
Wordnik The Wordnik API lets you request definitions, example sentences, spelling suggestions, related words like synonyms and antonyms, phrases containing a given word, word autocompletion, random words, words of the day, and much more.  
poetrydb PoetryDB is an API for internet poets.  
Echo Nest The Echo Nest offers an incredible array of music data and services for developers to build amazing apps and experiences. See also echonest/remix.  
Openapi 🇷🇴 Provides information about companies, IP geolocation, validation for CIF / CNP / IBAN / BIC, exchange rate, postal codes, singular and plural forms of Romanian words  

Command-Line Tools

General purpose

puppeteer

Run a headless version of Chrome from Node.js

See also alternatives to standard command-line tools.

Working with specific formats

See also: structured text tools

jq

For processing JSON files. Reshaping JSON with jq.

See also: fx, gron

pup

For processing HTML. It reads from stdin, prints to stdout, and allows the user to filter parts of the page using CSS selectors.

Should work great with wget for web page data extraction.

fonttools

Does TTF/OTF conversion to and from XML. This allows you to edit fonts (e.g. metadata) in plain-text and then rebuild them.

osmosis

Filter & merge OpenStreetMap data files (XML, PBF).

electron-pdf

Generate a PDF from a URL, HTML or Markdown file.

textkit

For manipulating and analyzing text.

monolith

For saving complete web pages as a single HTML file.

csvkit

A suite of utilities for converting to and working with CSV, the king of tabular file formats.

Utilities

For de-warping scans

unproject_text: perspective recovery of text using transformed ellipses. Write-up.

page_dewarp: page dewarping and thresholding using a “cubic sheet” model. Write-up.

For upscaling images

RAISR: Google’s Rapid and Accurate Image Super Resolution is a technique that uses machine learning to upscale images. There are a few implementations of the algorithm on GitHub: movehand/raisr, MKFMIKU/RAISR


Online tools

Places for your code

Link Description Notes
Observable Observable is a better way to code. Discover insights faster and communicate more effectively with interactive notebooks for data analysis, visualization, and exploration. See this introductory series to get up to speed.
Glitch Glitch is the friendly community where you’ll build the app of your dreams.  
CodePen CodePen is a social development environment for front-end designers and developers.  
jsComplete   This is the simplest way I found to write React code like you would in a notepad. You can’t save your work, but it’s perfect for quick sketching. It also loads the latest version of React (16.2.0 at the moment).
JS Bin    
JSFiddle    

Mapping

Link Description Notes
geojson.io geojson.io is a quick, simple tool for creating, viewing, and sharing maps. geojson.io is named after GeoJSON, an open source data format, and it supports GeoJSON in all ways - but also accepts KML, GPX, CSV, GTFS, TopoJSON, and other formats.  
Overpass Turbo A web-based data filtering tool for OpenStreetMap. With overpass turbo you can run Overpass API queries and analyse the resulting OSM data interactively on a map. There is an integrated Wizard which makes creating queries super easy.  

Miscellaneous tips & tricks

Start a server for a folder

Run python -m SimpleHTTPServer in your project folder to make it available at http://localhost:8000.

Note: The syntax above is for Python 2.x, which older versions of macOS and some Linux distributions shipped preinstalled. The equivalent syntax for Python 3 is python -m http.server (the interpreter may be installed as python3).

If you’re using Node.js, you can also use the serve package, which you can run without installing with npx (it comes with recent versions of npm):

npx serve

An even simpler server is servor:

npx servor

Also take a look at budo.
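
If you would rather not pull in a package at all, the same idea can be sketched with Node's built-in modules. This is a rough illustration, not a replacement for serve or servor: the names `resolveInside`, `createFolderServer` and `MIME_TYPES` are made up for this example, and the list of content types is deliberately short.

```javascript
const http = require('http');
const fs = require('fs');
const path = require('path');

// A deliberately short list of content types (an assumption; extend as needed).
const MIME_TYPES = {
    '.html': 'text/html',
    '.css': 'text/css',
    '.js': 'text/javascript',
    '.json': 'application/json',
    '.svg': 'image/svg+xml',
    '.png': 'image/png'
};

// Map a request URL to a file path inside `root`.
// Returns null if the URL is malformed or would escape the folder.
function resolveInside(root, requestUrl) {
    let pathname;
    try {
        pathname = decodeURIComponent(new URL(requestUrl, 'http://localhost').pathname);
    } catch (err) {
        return null; // malformed URL or percent-encoding
    }
    if (pathname.includes('\0')) return null;
    const rootDir = path.resolve(root);
    const filePath = path.join(rootDir, pathname);
    const relative = path.relative(rootDir, filePath);
    if (relative === '..' || relative.startsWith('..' + path.sep) || path.isAbsolute(relative)) {
        return null;
    }
    return filePath;
}

// Create (but do not start) an HTTP server that serves files from `root`.
function createFolderServer(root) {
    return http.createServer((req, res) => {
        let filePath = resolveInside(root, req.url);
        if (filePath === null) {
            res.writeHead(403);
            res.end('Forbidden');
            return;
        }
        fs.stat(filePath, (err, stats) => {
            // Serve index.html when a directory is requested.
            if (!err && stats.isDirectory()) {
                filePath = path.join(filePath, 'index.html');
            }
            const stream = fs.createReadStream(filePath);
            stream.on('open', () => {
                const type = MIME_TYPES[path.extname(filePath)] || 'application/octet-stream';
                res.writeHead(200, { 'Content-Type': type });
                stream.pipe(res);
            });
            stream.on('error', () => {
                // Missing or unreadable file.
                if (!res.headersSent) res.writeHead(404);
                res.end();
            });
        });
    });
}

// Usage: createFolderServer('.').listen(8000);
```

Calling `createFolderServer('.').listen(8000)` from your project folder makes it available at http://localhost:8000, like the commands above.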

Fetch a file from the web

To fetch a file from the web with the command line, using wget is straightforward:

wget http://download.geofabrik.de/europe/romania-latest.osm.pbf

👉 See more wget tricks

Or, fetching a file in Node:

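const http = require('http');
const fs = require('fs');
// Stream the response to disk; note that http.get() does not follow redirects.
http.get('http://download.geofabrik.de/europe/romania-latest.osm.pbf', function(response) {
    response.pipe(fs.createWriteStream('romania-latest.osm.pbf'));
});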
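
The Node snippet above writes whatever the server sends, including error pages, and `http.get()` does not follow redirects on its own. A slightly more defensive sketch might look like the following. The `download` function and its `maxRedirects` parameter are hypothetical names for this example, and the error handling is kept minimal.

```javascript
const http = require('http');
const https = require('https');
const fs = require('fs');

// Download `url` to `dest`, following up to `maxRedirects` redirects.
// Calls `done(err)` once the file is fully written or something failed.
function download(url, dest, done, maxRedirects = 5) {
    // Pick the client module that matches the URL scheme.
    const client = url.startsWith('https:') ? https : http;
    client.get(url, response => {
        const { statusCode, headers } = response;
        if (statusCode >= 300 && statusCode < 400 && headers.location) {
            response.resume(); // discard the redirect body
            if (maxRedirects === 0) return done(new Error('Too many redirects'));
            // Location may be relative, so resolve it against the current URL.
            const next = new URL(headers.location, url).href;
            return download(next, dest, done, maxRedirects - 1);
        }
        if (statusCode !== 200) {
            response.resume();
            return done(new Error('Unexpected status code: ' + statusCode));
        }
        // Stream the body to disk instead of buffering it in memory.
        const file = fs.createWriteStream(dest);
        response.pipe(file);
        file.on('finish', () => done(null));
        file.on('error', done);
    }).on('error', done);
}

// Usage:
// download('http://download.geofabrik.de/europe/romania-latest.osm.pbf',
//     'romania-latest.osm.pbf',
//     err => console.log(err ? 'Failed: ' + err.message : 'Done'));
```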

Make an S3 bucket publicly available

To use an S3 bucket to keep a bunch of files and make them publicly available, you need to create a bucket policy. This is under the Permissions section on the Properties tab for your bucket. Paste this into the bucket policy (myBucketName should be the name of your bucket):

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Sid": "PublicReadGetObject",
			"Effect": "Allow",
			"Principal": "*",
			"Action": "s3:GetObject",
			"Resource": "arn:aws:s3:::myBucketName/*"
		}
	]
}

All the files inside will typically be available at http://myBucketName.s3.amazonaws.com/path/to/file, but you can grab the exact URL from the Static Website Hosting section in the Properties tab.
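
If you set up buckets often, a small script can fill in the bucket name for you. The sketch below rebuilds the policy shown above and the typical URL pattern; `publicReadPolicy` and `publicUrl` are hypothetical helper names, and the URL follows the pattern mentioned above rather than the exact address from the console.

```javascript
// Build the public-read bucket policy shown above for a given bucket name.
function publicReadPolicy(bucketName) {
    return {
        Version: '2012-10-17',
        Statement: [
            {
                Sid: 'PublicReadGetObject',
                Effect: 'Allow',
                Principal: '*',
                Action: 's3:GetObject',
                Resource: 'arn:aws:s3:::' + bucketName + '/*'
            }
        ]
    };
}

// Typical public URL for an object in the bucket, per the pattern above.
// Each path segment is percent-encoded so spaces and similar characters work.
function publicUrl(bucketName, key) {
    const encodedKey = key.split('/').map(encodeURIComponent).join('/');
    return 'http://' + bucketName + '.s3.amazonaws.com/' + encodedKey;
}

// Print a policy ready to paste, and an example URL.
console.log(JSON.stringify(publicReadPolicy('myBucketName'), null, '\t'));
console.log(publicUrl('myBucketName', 'videos/my clip.mp4'));
```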

Extra credit: This is super-useful for hosting video files, since S3 supports partial content requests, which are needed to loop <video> elements on your web pages.

Colophon

This list was inspired by javierarce/toolbox.