In Go, bufio is the standard library package for buffered I/O. Buffered I/O temporarily accumulates the data of an I/O operation in memory before passing it on. This can speed up a program by reducing the number of system calls, which are typically slow operations. In this shot, we will look at some of the abstractions bufio provides for writing and reading.
Writing with bufio
With bufio, we can use the bufio.Writer type to accumulate data in a buffer before writing it to the underlying io.Writer. The example below demonstrates three situations that you are likely to encounter:
Once the buffer is full, the next write that does not fit flushes the accumulated data to the underlying writer.
If the buffer still has space after the last write, the buffered data is not written out until we explicitly call the Flush() method.
If a write is larger than the buffer capacity and the buffer is empty, the buffer is skipped and the data is written directly, because there is nothing to gain from buffering it.
package main

import (
	"bufio"
	"fmt"
)

// Writer is a stand-in for a real destination: it prints every chunk
// that the buffered writer passes on to it.
type Writer int

func (*Writer) Write(p []byte) (n int, err error) {
	fmt.Printf("Writing: %s\n", p)
	return len(p), nil
}

func main() {
	// Declare a buffered writer with a buffer size of 4.
	w := new(Writer)
	bw := bufio.NewWriterSize(w, 4)

	// Case 1: writing to the buffer until it is full.
	bw.Write([]byte{'1'})
	bw.Write([]byte{'2'})
	bw.Write([]byte{'3'})
	bw.Write([]byte{'4'}) // buffer is now full

	// Case 2: the buffer has space left.
	bw.Write([]byte{'5'}) // does not fit: "1234" is written out, '5' is buffered
	err := bw.Flush()     // forcefully write the remaining '5'
	if err != nil {
		panic(err)
	}

	// Case 3: the write is too large for the (empty) buffer.
	// It skips the buffer and is written directly.
	bw.Write([]byte("12345"))
}
We can reuse the same bufio.Writer for a different underlying writer with the Reset() method, which discards any data that has not been flushed:
writerOne := new(Writer)
bw := bufio.NewWriterSize(writerOne,2)
writerTwo := new(Writer)
bw.Reset(writerTwo)
We can check how many bytes are still unused in the buffer with the Available() method.
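As a minimal sketch of how Available() behaves, the snippet below wraps a bytes.Buffer in a writer with a 16-byte buffer and writes five bytes to it. The bytes.Buffer sink and the helper name spaceLeft are assumptions made for illustration; Buffered() is the companion method that reports how many bytes are waiting to be flushed.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// spaceLeft is a hypothetical helper: it writes data to a buffered writer
// of the given size and reports the buffered and available byte counts
// before any explicit flush takes place.
func spaceLeft(size int, data string) (buffered, available int) {
	var sink bytes.Buffer // stand-in for any io.Writer
	bw := bufio.NewWriterSize(&sink, size)
	bw.WriteString(data) // stays in the buffer as long as data fits
	return bw.Buffered(), bw.Available()
}

func main() {
	buffered, available := spaceLeft(16, "hello")
	fmt.Println("buffered:", buffered)   // buffered: 5
	fmt.Println("available:", available) // available: 11
}
```

The two values always add up to the buffer size, so either one can be used to decide whether the next write will fit without a flush.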
Reading with bufio
bufio allows us to read in batches with bufio.Reader, which wraps an io.Reader under the hood. In this section, we look at the following:
Peek
ReadSlice
ReadLine
ReadBytes
Scanner
Peek
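The Peek method lets us see the next n bytes (n is referred to as the peek value) in the buffer without consuming them. The method operates in the following way:
If the peek value is no larger than the buffer size and enough input is available, Peek returns exactly that many bytes.
If the peek value is larger than the buffer size, Peek returns bufio.ErrBufferFull.
If the input ends before the peek value is reached, Peek returns the bytes that are available together with io.EOF.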
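To illustrate that Peek does not consume anything, here is a small sketch that peeks at some bytes and then reads the same number of bytes from the same reader. The helper name peekThenRead and the sample input are assumptions chosen for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// peekThenRead is a hypothetical helper: it peeks at the first n bytes of
// input and then reads n bytes, showing that Peek left them in place.
func peekThenRead(input string, n int) (peeked, read string, err error) {
	br := bufio.NewReader(strings.NewReader(input))

	p, err := br.Peek(n)
	if err != nil {
		// A short peek still returns whatever bytes were available.
		return string(p), "", err
	}
	peeked = string(p) // copy now: the slice is only valid until the next read

	buf := make([]byte, n)
	if _, err = io.ReadFull(br, buf); err != nil {
		return peeked, "", err
	}
	return peeked, string(buf), nil
}

func main() {
	peeked, read, err := peekThenRead("I'd love coffee", 3)
	fmt.Printf("%q %q %v\n", peeked, read, err) // "I'd" "I'd" <nil>
}
```

Because both values are identical, we can see that the read started at the same position the peek did.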
ReadSlice
ReadSlice has a signature of:
func (b *Reader) ReadSlice(delim byte) (line []byte, err error)
It reads until the first occurrence of the delimiter and returns a slice that includes the delimiter. For example, if the input is 1,2,3 and we use a comma as the delimiter, successive calls return:
1,
2,
3
The last call also returns io.EOF: if the delimiter cannot be found and EOF has been reached, ReadSlice returns the remaining data together with io.EOF. If the delimiter is not found before the buffer fills up, bufio.ErrBufferFull is returned. The returned slice points into the internal buffer, so it is only valid until the next read.
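The two error cases can be sketched as follows. The helper name splitWithReadSlice and the 16-byte buffer size are assumptions made for illustration; the helper simply calls ReadSlice until it reports an error.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// splitWithReadSlice is a hypothetical helper: it calls ReadSlice until an
// error occurs and returns the tokens seen so far plus that final error.
func splitWithReadSlice(input string, delim byte, bufSize int) ([]string, error) {
	br := bufio.NewReaderSize(strings.NewReader(input), bufSize)
	var tokens []string
	for {
		line, err := br.ReadSlice(delim)
		if len(line) > 0 {
			// Convert to a string (a copy) because line points into the
			// reader's buffer and may be overwritten by later reads.
			tokens = append(tokens, string(line))
		}
		if err != nil {
			return tokens, err
		}
	}
}

func main() {
	tokens, err := splitWithReadSlice("1,2,3", ',', 16)
	fmt.Printf("%q %v\n", tokens, err) // ["1," "2," "3"] EOF

	// No delimiter within the first 16 bytes: the buffer fills up first.
	_, err = splitWithReadSlice(strings.Repeat("x", 40), ',', 16)
	fmt.Println(err == bufio.ErrBufferFull) // true
}
```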
ReadLine
ReadLine is defined as:
func (b *Reader) ReadLine() (line []byte, isPrefix bool, err error)
ReadLine uses ReadSlice under the hood. However, it removes the end-of-line characters (\n or \r\n) from the returned slice.
Note that its signature is different because it returns the isPrefix flag as well. This flag is true when the end of the line has not been found and the internal buffer is full.
ReadLine does not return a line longer than the internal buffer in one piece. Instead, it returns the first part with isPrefix set to true, and we can call it repeatedly to read the rest; isPrefix is false for the last fragment of the line.
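One possible way to reassemble long lines from their fragments is sketched below. The helper name readFullLines, the 16-byte buffer, and the sample input are assumptions made for this example, not part of the bufio API.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readFullLines is a hypothetical helper: it reads every line of input
// through a reader with a bufSize-byte buffer, stitching together the
// fragments that ReadLine returns for lines longer than the buffer.
// It also counts how many successful ReadLine calls were needed.
func readFullLines(input string, bufSize int) (lines []string, calls int) {
	br := bufio.NewReaderSize(strings.NewReader(input), bufSize)
	var current []byte
	for {
		part, isPrefix, err := br.ReadLine()
		if err != nil { // io.EOF once the input is exhausted
			return lines, calls
		}
		calls++
		current = append(current, part...) // copy part out of the buffer
		if !isPrefix {
			// Last fragment of this line: store it and start a new one.
			lines = append(lines, string(current))
			current = current[:0]
		}
	}
}

func main() {
	input := strings.Repeat("a", 40) + "\nshort\n"
	lines, calls := readFullLines(input, 16)
	fmt.Println(len(lines), len(lines[0]), lines[1], calls) // 2 40 short 4
}
```

The 40-byte line needs three calls with a 16-byte buffer (16 + 16 + 8 bytes), while the short line is returned by a single call.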
ReadBytes
ReadBytes (not to be confused with ReadByte, which reads a single byte) has a signature of:
func (b *Reader) ReadBytes(delim byte) ([]byte, error)
Similar to ReadSlice, ReadBytes returns the data up to and including the delimiter. In fact, ReadBytes is built on top of ReadSlice, which acts as the underlying low-level function. However, ReadBytes can call ReadSlice multiple times to accumulate the data it returns, thereby circumventing the buffer size limitation. Additionally, since ReadBytes returns a new byte slice, it is safer to use because subsequent read operations will not overwrite the data.
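The difference in safety can be made visible with a small experiment. The sketch below assumes a 16-byte buffer and two 11-byte tokens, so that the second read has to refill the buffer; the helper names are hypothetical, and exactly which bytes end up in the reused buffer is an implementation detail.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

const input = "0123456789,abcdefghij,"

// firstTokenSurvives is a hypothetical helper: it reads two tokens with the
// given read function and reports whether the first token still holds its
// original bytes after the second read.
func firstTokenSurvives(read func(*bufio.Reader) ([]byte, error)) bool {
	// A 16-byte buffer cannot hold both 11-byte tokens at once, so the
	// second read has to refill the buffer.
	br := bufio.NewReaderSize(strings.NewReader(input), 16)
	first, err := read(br)
	if err != nil {
		return false
	}
	want := string(first) // snapshot taken before the second read
	if _, err := read(br); err != nil {
		return false
	}
	return string(first) == want
}

func viaReadSlice(br *bufio.Reader) ([]byte, error) { return br.ReadSlice(',') }
func viaReadBytes(br *bufio.Reader) ([]byte, error) { return br.ReadBytes(',') }

func main() {
	fmt.Println("ReadSlice:", firstTokenSurvives(viaReadSlice)) // ReadSlice: false
	fmt.Println("ReadBytes:", firstTokenSurvives(viaReadBytes)) // ReadBytes: true
}
```

If we need to keep a token returned by ReadSlice, we have to copy it ourselves before the next read; ReadBytes does that copy for us.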
Scanner
Scanner breaks a stream of data into tokens. Scanning stops at EOF, at the first I/O error, or if a token is too large to fit into the buffer. If more control over error handling is required, use bufio.Reader.
A Scanner is created with NewScanner, which has a signature of:
func NewScanner(r io.Reader) *Scanner
The split function used for dividing the text into tokens defaults to ScanLines; however, you can change it with the Split method if need be.
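As a brief sketch of changing the split function, the snippet below swaps the default ScanLines for bufio.ScanWords, one of the split functions that ships with the package. The helper name words and the sample input are assumptions for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// words is a hypothetical helper that tokenizes input on whitespace by
// replacing the default ScanLines split function with bufio.ScanWords.
func words(input string) ([]string, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	scanner.Split(bufio.ScanWords) // must be called before the first Scan
	var out []string
	for scanner.Scan() {
		out = append(out, scanner.Text())
	}
	// Err returns nil when scanning simply reached the end of the input.
	return out, scanner.Err()
}

func main() {
	tokens, err := words("Reading is my...\r\n favourite")
	fmt.Printf("%q %v\n", tokens, err) // ["Reading" "is" "my..." "favourite"] <nil>
}
```

Checking scanner.Err() after the loop is how we tell a clean end of input apart from an I/O error or an oversized token.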
package main

import (
	"bufio"
	"fmt"
	"strings"
)

const singleLine string = "I'd love to have some coffee right about now"
const multiLine string = "Reading is my...\r\n favourite"

func main() {
	fmt.Printf("Length of singleLine input is %d\n", len(singleLine))
	str := strings.NewReader(singleLine)
	br := bufio.NewReaderSize(str, 25)

	fmt.Println("\n---Peek---")
	// Peek - Case 1: simple peek
	b, err := br.Peek(3)
	if err != nil {
		fmt.Println(err)
	}
	fmt.Printf("%q\n", b) // output: "I'd"

	// Peek - Case 2: peek value larger than the buffer size
	_, err = br.Peek(30)
	if err != nil {
		fmt.Println(err) // output: bufio: buffer full
	}

	// Peek - Case 3: peek value larger than the input
	// (a fresh reader is used because br has already read from str)
	brLarge := bufio.NewReaderSize(strings.NewReader(singleLine), 50)
	_, err = brLarge.Peek(50)
	if err != nil {
		fmt.Println(err) // output: EOF
	}

	// ReadSlice
	fmt.Println("\n---ReadSlice---")
	str = strings.NewReader(multiLine)
	r := bufio.NewReader(str)
	for {
		token, err := r.ReadSlice('.')
		if len(token) > 0 {
			fmt.Printf("Token (ReadSlice): %q\n", token)
		}
		if err != nil {
			break
		}
	}

	// ReadLine
	fmt.Println("\n---ReadLine---")
	str = strings.NewReader(multiLine)
	r = bufio.NewReader(str)
	for {
		token, _, err := r.ReadLine()
		if len(token) > 0 {
			fmt.Printf("Token (ReadLine): %q\n", token)
		}
		if err != nil {
			break
		}
	}

	// ReadBytes
	fmt.Println("\n---ReadBytes---")
	str = strings.NewReader(multiLine)
	r.Reset(str)
	for {
		token, err := r.ReadBytes('\n')
		fmt.Printf("Token (ReadBytes): %q\n", token)
		if err != nil {
			break
		}
	}

	// Scanner
	fmt.Println("\n---Scanner---")
	str = strings.NewReader(multiLine)
	scanner := bufio.NewScanner(str)
	for scanner.Scan() {
		fmt.Printf("Token (Scanner): %q\n", scanner.Text())
	}
}
Read more about bufio in the official documentation.