...

Improving the colStats Tool to Process Files Concurrently

Learn how to improve the colStats tool to process files concurrently.

As the tracer output showed, the tool processes files sequentially. This is inefficient when there are several files to process, because each file has to wait for the previous one to finish. By changing the program to process files concurrently, we can take advantage of multiprocessor machines and use more CPUs, which generally makes the program run faster. The program spends less time waiting for resources and more time processing files.

Add the sync package

One of the main benefits of Go is its concurrency model. Go includes concurrency primitives that allow us to add concurrency to our programs in a more intuitive way. By using goroutines and channels, we can modify the current colStats tool to process several files concurrently by making changes just to the run() function. The other functions remain unchanged.

First, we add the sync package to the imports section, which provides synchronization types such as the WaitGroup:

import (
	"flag"
	"fmt"
	"io"
	"os"
	"sync"
)

Updating the run() function

We’ll update the run() function to process files concurrently by creating a new goroutine for each file we need to process. But first we’ll need to create some channels to communicate between the goroutines.

We’ll use three channels:

  • resCh of type chan []float64 to communicate results of processing each file.
  • errCh of type chan error to communicate potential errors.
  • doneCh of type chan struct{}
...