What is FIO?
fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user. The typical use of fio is to write a job file matching the I/O load one wants to simulate. – (https://linux.die.net/man/1/fio)
fio is a useful tool for measuring the I/O of a specific application workload on a particular device or file. It is a detailed benchmarking tool with many options, and it is still widely used for workloads today. I came across it while working at EMC, when I needed to benchmark the disk I/O of applications running in different Linux container runtimes. This leads me to my next topic.
Why Docker based fio-tools
One of the projects I was working on used Docker on AWS and on various private cloud deployments. We wanted to see how workloads performed inside Docker containers in these different cloud environments, with various CPU, memory, and disk I/O limits, and with various block, flash, or DAS-based storage devices.
One way we wanted to do this was to containerize fio and allow users to pass the workload configuration and the disk to the container that was doing the testing.
The first part of this was to containerize fio with the option to pass in job files by pathname or by a URL such as a raw GitHub Gist.
The Dockerfile (below) is based on Ubuntu 14.10, which admittedly is not the smallest base image, but it lets us easily install fio and set a CMD script called run.sh.
FROM ubuntu:14.10
MAINTAINER <Ryan Wallner email@example.com>
RUN sed -i -e 's/archive.ubuntu.com/old-releases.ubuntu.com/g' /etc/apt/sources.list
RUN apt-get -y update && apt-get -y install fio wget
VOLUME /tmp/fio-data
ADD run.sh /opt/run.sh
RUN chmod +x /opt/run.sh
WORKDIR /tmp/fio-data
CMD ["/opt/run.sh"]
What does run.sh do? The script does a few things. It checks that you are passing JOBFILES (the fio job file name); without REMOTEFILES, that file is expected to exist in `/tmp/fio-data`. It also cleans up the fio-data directory by moving any `.fio` job files out and then back in while removing old graphs or output. If the user passes in REMOTEFILES, those files are downloaded from the internet with wget before being used.
#!/bin/bash

[ -z "$JOBFILES" ] && echo "Need to set JOBFILES" && exit 1;
echo "Running $JOBFILES"

# We really want no old data in here except the fio script
mv /tmp/fio-data/*.fio /tmp/
rm -rf /tmp/fio-data/*
mv /tmp/*fio /tmp/fio-data/

if [ ! -z "$REMOTEFILES" ]; then
  # We really want no old data in here
  rm -rf /tmp/fio-data/*
  IFS=' '
  echo "Gathering remote files..."
  for file in $REMOTEFILES; do
    wget --directory-prefix=/tmp/fio-data/ "$file"
  done
fi

fio $JOBFILES
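To see what the cleanup step in run.sh does without touching a real `/tmp/fio-data`, here is a minimal sketch that applies the same move-out, delete, move-back idea to a throwaway directory. The helper name `clean_data_dir` and the sample file names are assumptions made for illustration; they are not part of fio-tools.

```shell
#!/bin/bash
# Sketch of the run.sh cleanup idea, run against temporary directories.
# clean_data_dir is a hypothetical helper name, not part of fio-tools.
clean_data_dir() {
  local data_dir="$1"
  local stash
  stash="$(mktemp -d)"
  # Move job files out of the way (silently skip if there are none).
  mv "$data_dir"/*.fio "$stash"/ 2>/dev/null
  # Remove everything else: old logs, graphs, and output.
  rm -rf "${data_dir:?}"/*
  # Put the job files back and drop the now-empty stash directory.
  mv "$stash"/*.fio "$data_dir"/ 2>/dev/null
  rmdir "$stash"
}

# Demo: one job file plus two stale artifacts from an earlier run.
demo_dir="$(mktemp -d)"
touch "$demo_dir/job.fio" "$demo_dir/old_bw.1.log" "$demo_dir/old.png"
clean_data_dir "$demo_dir"
ls "$demo_dir"
```

After the call, only the `.fio` file remains in the demo directory, which is the state run.sh aims for before it starts fio.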
There are two other Dockerfiles, each aimed at one further operation: producing graphs of the output data with fio2gnuplot, and serving the graphs and output from a Python SimpleHTTPServer on port 8000.
All Dockerfiles and examples can be found here (https://github.com/wallnerryan/fio-tools). The repository also includes an all-in-one image called fiotools-aio that runs the job, generates the graphs, and serves them in one step.
How to use it
- Build the images or use the public images
- Create a Fio Jobfile
- Run the fio-tool image
docker run -v /tmp/fio-data:/tmp/fio-data \
  -e JOBFILES=<your-fio-jobfile> \
  wallnerryan/fio-tool
If your file is a remote raw text file, you can use REMOTEFILES
docker run -v /tmp/fio-data:/tmp/fio-data \
  -e REMOTEFILES="http://url.com/.fio" \
  -e JOBFILES=<your-fio-jobfile> wallnerryan/fio-tool
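Because run.sh loops over `$REMOTEFILES` with IFS set to a space, it looks like several URLs can be passed in one variable, separated by spaces. The sketch below is a dry run of that loop: it prints what would be fetched instead of calling wget. The `example.com` URLs and the `list_downloads` helper are placeholders made up for illustration.

```shell
#!/bin/bash
# Dry run of the REMOTEFILES loop from run.sh: print each URL and the
# path wget would save it to, without downloading anything.
# The URLs and the list_downloads helper are hypothetical.
REMOTEFILES="https://example.com/randread.fio https://example.com/seqwrite.fio"

list_downloads() {
  local IFS=' '   # same separator run.sh sets before its loop
  local file
  for file in $1; do
    # wget --directory-prefix keeps the last path component as the file name
    echo "$file -> /tmp/fio-data/$(basename "$file")"
  done
}

list_downloads "$REMOTEFILES"
```

JOBFILES should then name the downloaded file (or files) as they appear in `/tmp/fio-data`, for example `randread.fio`.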
Run the fio-genplots script
docker run -v /tmp/fio-data:/tmp/fio-data wallnerryan/fio-genplots \
  <fio2gnuplot options>
Serve your Graph Images and Log Files
docker run -p 8000:8000 -d -v /tmp/fio-data:/tmp/fio-data \
  wallnerryan/fio-plotserve
Easiest way: run the “all in one” image. (It automatically produces IOPS and BW graphs and serves them.)
docker run -p 8000:8000 -v /tmp/fio-data \
  -e REMOTEFILES="http://url.com/.fio" \
  -e JOBFILES=<your-fio-jobfile> \
  -e PLOTNAME=MyTest \
  -d --name MyFioTest wallnerryan/fiotools-aio
- Your fio job file should reference a mount or disk that you would like to run the job file against. In the job file it will look something like `directory=/my/mounted/volume` to test against docker volumes.
- If you want to run more than one all-in-one job, just use `-v /tmp/fio-data` instead of `-v /tmp/fio-data:/tmp/fio-data`. The host-path form is only needed when you run the individual tool images separately.
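For reference, a job file with a `directory=` entry might look like the sketch below, written out with a shell heredoc. Every value here is an assumption chosen for illustration (a 16k random-read test against a volume mounted at `/myvol`, with the log prefix `randread16k`); adjust them to the workload you actually want to simulate.

```shell
#!/bin/bash
# Write a sample fio job file. All values are illustrative assumptions.
job_dir="$(mktemp -d)"   # stand-in for the host directory mounted at /tmp/fio-data

cat > "$job_dir/job.fio" <<'EOF'
[global]
ioengine=libaio
direct=1
time_based
runtime=60
size=1g
; Path inside the container where the volume under test is mounted
directory=/myvol
; Write bandwidth and IOPS logs so graphs can be generated afterwards
write_bw_log=randread16k
write_iops_log=randread16k

[randread16k]
rw=randread
bs=16k
iodepth=16
numjobs=1
EOF

echo "Wrote $job_dir/job.fio"
```

The `write_bw_log` and `write_iops_log` options are what produce the `*_bw*` and `*_iops*` log files that the graphing step looks for.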
To use with docker and docker volumes
docker run \
  -e REMOTEFILES="https://gist.githubusercontent.com/wallnerryan/fd0146ee3122278d7b5f/raw/cdd8de476abbecb5fb5c56239ab9b6eb3cec3ed5/job.fio" \
  -v /tmp/fio-data:/tmp/fio-data \
  --volume-driver flocker \
  -v myvol1:/myvol \
  -e JOBFILES=job.fio wallnerryan/fio-tool
To produce graphs, run the fio-genplots container with -t <name of your graph> -p <pattern of your log files>
Produce Bandwidth Graphs
docker run -v /tmp/fio-data:/tmp/fio-data wallnerryan/fio-genplots \
  -t My16kAWSRandomReadTest -b -g -p *_bw*
Produce IOPS graphs
docker run -v /tmp/fio-data:/tmp/fio-data wallnerryan/fio-genplots \
  -t My16kAWSRandomReadTest -i -g -p *_iops*
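Before generating graphs, it can help to check which log files a `-p` pattern would actually select. The sketch below counts the files in a directory that match a glob. It assumes the logs follow fio's `<prefix>_bw.<n>.log` and `<prefix>_iops.<n>.log` naming; the `count_matches` helper and the sample file names are made up for illustration.

```shell
#!/bin/bash
# Count the files in a directory that match a glob pattern.
# count_matches is a hypothetical helper, not part of fio-tools.
count_matches() {
  local dir="$1" pattern="$2" n=0 f
  # The pattern is left unquoted on purpose so the shell expands the glob.
  for f in "$dir"/$pattern; do
    [ -e "$f" ] && n=$((n + 1))
  done
  echo "$n"
}

# Demo directory with the kinds of logs a fio run might leave behind.
logs="$(mktemp -d)"
touch "$logs/randread16k_bw.1.log" \
      "$logs/randread16k_iops.1.log" \
      "$logs/randread16k_lat.1.log"

echo "bw logs:   $(count_matches "$logs" '*_bw*')"
echo "iops logs: $(count_matches "$logs" '*_iops*')"
```

One thing to watch on the host: an unquoted `*_bw*` is expanded by your own shell if the current directory happens to contain matching files, so quoting the pattern (`'*_bw*'`) is the safer way to hand it to the container unchanged.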
Simply serve them on port 8000
docker run -p 8000:8000 -d \
  -v /tmp/fio-data:/tmp/fio-data \
  wallnerryan/fio-plotserve
To use the all-in-one image
docker run \
  -p 8000:8000 \
  -v /tmp/fio-data \
  -e REMOTEFILES="https://gist.githubusercontent.com/wallnerryan/fd0146ee3122278d7b5f/raw/006ff707bc1a4aae570b33f4f4cd7729f7d88f43/job.fio" \
  -e JOBFILES=job.fio \
  -e PLOTNAME=MyTest \
  --volume-driver flocker \
  -v myvol1:/myvol \
  -d \
  --name MyTest wallnerryan/fiotools-aio
To use with docker-machine/boot2docker/DockerForMac
You can use a remote fio configuration file using the REMOTEFILES env variable.
docker run \
  -e REMOTEFILES="https://gist.githubusercontent.com/wallnerryan/fd0146ee3122278d7b5f/raw/d089b6321746fe2928ce3f89fe64b437d1f669df/job.fio" \
  -e JOBFILES=job.fio \
  -v /Users/wallnerryan/Desktop/fio:/tmp/fio-data \
  wallnerryan/fio-tool
(or) If you already have a directory with the job files in it, mount that directory instead. *NOTE*: it must be a shared folder, configured under Docker > Preferences > File Sharing.
docker run -v /Users/wallnerryan/Desktop/fio:/tmp/fio-data \
  -e JOBFILES=job.fio wallnerryan/fio-tool
To produce graphs, run the genplots container with -t <name of your graph> -p <pattern of your log files>
docker run \
  -v /Users/wallnerryan/Desktop/fio:/tmp/fio-data wallnerryan/fio-genplots \
  -t My16kAWSRandomReadTest -b -g -p *_bw*
Simply serve them on port 8000
docker run -v /Users/wallnerryan/Desktop/fio:/tmp/fio-data \
  -d -p 8000:8000 wallnerryan/fio-plotserve
- The fio-tools container will clean up the /tmp/fio-data volume by default when you re-run it.
- If you want to save any data, copy this data out or save the files locally.
How to get graphs
- When you serve on port 8000, you will see a list of all the logs and plots that were created. Click on the .png files to see the graphs (see below for an example screen).
Testing and building with Codefresh
As a side note, I recently added this repository to build on Codefresh. Right now it only builds the fiotools-aio Dockerfile, which is the one I find most useful, but it was an easy experience that I wanted to add to the end of this post.
Navigate to https://g.codefresh.io/repositories? or create a free account by logging into Codefresh with your GitHub account. When you log in with GitHub, Codefresh gets access to the repositories you authorize, and this is where the fio-tools images are.
I added the repository as a build and configured it like so.
This will automatically build my Dockerfile and run any integration and unit tests I have configured in Codefresh, though right now I have none. I will soon add a simple job that runs against a file as an integration test using a Codefresh composition.
Over my time using both native Linux tools and Docker-based or containerized tools, I have found that there is sometimes a need for both. In fact, when testing container-native application workloads, it is sometimes best to get metrics or benchmarks from the point of view of the application, which is why we chose to run fio as a microservice itself.
Hopefully this was an enjoyable read and thanks for stopping by!