occasional long (500+ ms) Write or Flush times in golang
I have a project where we read data from some source, process it, and then write some subset (possibly compressed) of that data to disk. We write to many (usually around 200) different files corresponding to different "channels" of the data, and often write a few MB/s to disk across these files.
I'm seeing that approximately once every minute or so, a call to Write or to Flush takes around 500 ms. I'd like to reduce that down to below 50 ms if possible.
I tried to make a minimum working example (MWE) with this odd behavior, but I can't reproduce the long Write and Flush calls on the machine that shows the behavior with our full project. However, on my MacBook Pro laptop, I can cause 50 ms Flush calls, which is still 100 times longer than typical. The attempted MWE is below, and it reproduces the way we're writing data nearly exactly. It has some flags at the top for flushing in different parts of the code; I was testing to see if calling Flush in certain locations helped. Any guidance on how to get consistent timings on the Flush and Write calls would be super helpful. Even if the average write time were slower, I just want consistent behavior.
package main

import (
	"bufio"
	"fmt"
	"io/ioutil"
	"os"
	"os/signal"
	"sync"
	"time"
)

const flushWithinBlock = true
const flushAfterBlocks = true

type Writer struct {
	FileName      string
	headerWritten bool
	writer        *bufio.Writer
}

func (w *Writer) writeHeader() error {
	file, err := os.Create(w.FileName)
	if err != nil {
		return err
	}
	// 32 KiB buffer per file; bytes only reach the OS when the buffer fills or Flush is called
	w.writer = bufio.NewWriterSize(file, 32768)
	w.writer.WriteString("HEADER\n")
	w.headerWritten = true
	return nil
}

func (w *Writer) writeRecord(nBytes int) error {
	data := make([]byte, nBytes) // dummy record payload
	nWritten, err := w.writer.Write(data)
	if nWritten != nBytes {
		return fmt.Errorf("wrong number of bytes written")
	}
	return err
}

func main() {
	dirname, err0 := ioutil.TempDir("", "")
	if err0 != nil {
		panic(err0)
	}
	fmt.Println(dirname)
	recordLength := 500
	numberOfChannels := 240
	recordsPerChanPerTick := 5
	writers := make([]*Writer, numberOfChannels)
	abortChan := make(chan struct{})
	for i := range writers {
		writers[i] = &Writer{FileName: fmt.Sprintf("%v/%v.ljh", dirname, i)}
	}
	go func() {
		signalChan := make(chan os.Signal, 1)
		signal.Notify(signalChan, os.Interrupt)
		<-signalChan
		close(abortChan)
	}()
	tickDuration := 50 * time.Millisecond
	ticker := time.NewTicker(tickDuration)
	z := 0
	tLast := time.Now()
	fmt.Printf("recordsPerChanPerTick %v, Chans %v, tickDuration %v\n", recordsPerChanPerTick, numberOfChannels, tickDuration)
	fmt.Printf("records/second/chan %v, records/second total %v\n", float64(recordsPerChanPerTick)/tickDuration.Seconds(), float64(recordsPerChanPerTick*numberOfChannels)/tickDuration.Seconds())
	fmt.Printf("megabytes/second total %v\n", float64(recordLength)*float64(recordsPerChanPerTick*numberOfChannels)/tickDuration.Seconds()*1e-6)
	fmt.Printf("flushWithinBlock %v, flushAfterBlocks %v\n", flushWithinBlock, flushAfterBlocks)
	for {
		// 1. here we would get data from data source
		z++
		select {
		case <-abortChan:
			fmt.Println("clean exit")
			return
		case <-ticker.C:
			var wg sync.WaitGroup
			writeDurations := make([]time.Duration, numberOfChannels)
			flushDurations := make([]time.Duration, numberOfChannels)
			// one goroutine per channel writes that channel's records for this tick
			for i, w := range writers {
				wg.Add(1)
				go func(w *Writer, i int) {
					tStart := time.Now()
					defer wg.Done()
					for j := 0; j < recordsPerChanPerTick; j++ {
						if !w.headerWritten {
							err := w.writeHeader()
							if err != nil {
								panic(fmt.Sprintf("failed to create file and write header: %v\n", err))
							}
						}
						w.writeRecord(recordLength)
					}
					tWrite := time.Now()
					if flushWithinBlock {
						w.writer.Flush()
					}
					writeDurations[i] = tWrite.Sub(tStart)
					flushDurations[i] = time.Now().Sub(tWrite)
				}(w, i)
			}
			wg.Wait()
			for _, w := range writers {
				if flushAfterBlocks {
					w.writer.Flush()
				}
			}
			var writeSum time.Duration
			var flushSum time.Duration
			var writeMax time.Duration
			var flushMax time.Duration
			for i := range writeDurations {
				writeSum += writeDurations[i]
				flushSum += flushDurations[i]
				if writeDurations[i] > writeMax {
					writeMax = writeDurations[i]
				}
				if flushDurations[i] > flushMax {
					flushMax = flushDurations[i]
				}
			}
			if z%100 == 0 || time.Now().Sub(tLast) > 75*time.Millisecond {
				fmt.Printf("z %v, time.Now().Sub(tLast) %v\n", z, time.Now().Sub(tLast))
				fmt.Printf("writeMean %v, flushMean %v, writeMax %v, flushMax %v\n", writeSum/time.Duration(numberOfChannels), flushSum/time.Duration(numberOfChannels), writeMax, flushMax)
			}
			tLast = time.Now()
		}
	}
}
Example output on an Ubuntu 16 machine with a spinning hard drive; this is the actual hardware that shows the 500 ms Write and Flush calls in our full project:
/tmp/296105809
recordsPerChanPerTick 5, Chans 240, tickDuration 50ms
records/second/chan 100, records/second total 24000
megabytes/second total 12
flushWithinBlock true, flushAfterBlocks true
z 1, time.Now().Sub(tLast) 75.96973ms
writeMean 14.017745ms, flushMean 7.847µs, writeMax 24.761147ms, flushMax 420.896µs
z 100, time.Now().Sub(tLast) 50.13856ms
writeMean 1.71µs, flushMean 4.213µs, writeMax 12.271µs, flushMax 32.133µs
z 200, time.Now().Sub(tLast) 50.006063ms
writeMean 1.651µs, flushMean 3.032µs, writeMax 79.006µs, flushMax 7.246µs
z 300, time.Now().Sub(tLast) 50.151421ms
writeMean 1.685µs, flushMean 4.542µs, writeMax 10.429µs, flushMax 14.087µs
z 400, time.Now().Sub(tLast) 50.059208ms
Example output on a MacBook Pro with an SSD. You can see some much longer Write and Flush calls here, but nothing in the 500 ms range. Note the 30 ms flushMax at z 81 vs the more typical 500 µs flushMax at z 100.
/var/folders/_0/25kp6h7x25v6vyjv2yjlcnkm000wrm/T/934618054
recordsPerChanPerTick 5, Chans 240, tickDuration 50ms
records/second/chan 100, records/second total 24000
megabytes/second total 12
flushWithinBlock true, flushAfterBlocks true
z 1, time.Now().Sub(tLast) 84.897446ms
writeMean 10.265068ms, flushMean 464.53µs, writeMax 24.752873ms, flushMax 3.528286ms
... some output removed
... NOTE: the z 81 line was printed because that tick took longer than normal
z 81, time.Now().Sub(tLast) 75.804358ms
writeMean 15.056µs, flushMean 18.324892ms, writeMax 408.406µs, flushMax 30.765425ms
z 100, time.Now().Sub(tLast) 54.753448ms
writeMean 3.25µs, flushMean 84.963µs, writeMax 74.152µs, flushMax 499.322µs
Tags: file, go, io, flush, goroutine
asked Nov 20 '18 at 20:27 by gggg
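Aside on the MWE code: writeRecord allocates a fresh slice for every record, which at 24,000 records/s is steady allocation pressure, and garbage-collector pauses are one speculative (unconfirmed) contributor to occasional latency spikes. A minimal sketch of a variant that reuses a per-Writer scratch buffer; the method name and the buf []byte field added to Writer are hypothetical, not from the question:

// writeRecordReuse is a hypothetical variant of writeRecord that reuses
// one scratch buffer per Writer instead of allocating per record. Safe
// only because each Writer is used by a single goroutine at a time, as
// in the MWE; a real caller would fill data before writing it.
func (w *Writer) writeRecordReuse(nBytes int) error {
	// grow the scratch buffer only when a record is larger than any seen so far
	if cap(w.buf) < nBytes {
		w.buf = make([]byte, nBytes)
	}
	data := w.buf[:nBytes]
	nWritten, err := w.writer.Write(data)
	if err != nil {
		return err
	}
	if nWritten != nBytes {
		return fmt.Errorf("wrong number of bytes written")
	}
	return nil
}

Whether allocation pressure actually contributes to the spikes here is untested; profiling with GODEBUG=gctrace=1 would confirm or rule it out.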
I'm not sure where the exact solution lies between Go and the operating system, but if you watch a graph of disk I/O when you run this, you'll notice (at least on my Mac with SSD) very little activity, but the graph will spike and the Go test will spit out a bunch of high-latency write times. Somewhere along the pipeline from memory to disk, you need to reduce the write buffer size (which may be a bad idea depending on the specific application). As to exactly where, I'm not sure, but it's very possibly at the OS or disk level. Graph spike: imgur.com/a/ATNGSKE
– RayfenWindspear
Nov 20 '18 at 21:15
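Following up on this comment with a sketch under assumptions, not a confirmed fix: on Linux, the kernel accumulates dirty pages and flushes them to the device in bursts, which can stall a writer for hundreds of milliseconds. One way to keep the backlog small is to retain the *os.File (the Writer in the question discards it) and sync it periodically, trading average throughput for smoother latency. SyncedWriter and FlushAndSync are hypothetical names:

// Sketch: keep the *os.File so buffered data can be pushed all the way
// to the device periodically, preventing the kernel from building up a
// large backlog of dirty pages that it later flushes in one long burst.
type SyncedWriter struct {
	file   *os.File      // retained so we can force data to the device
	writer *bufio.Writer // same 32 KiB buffering as in the question
}

// FlushAndSync moves buffered bytes into the OS, then asks the OS to
// write its dirty pages out to the device.
func (w *SyncedWriter) FlushAndSync() error {
	if err := w.writer.Flush(); err != nil {
		return err
	}
	return w.file.Sync()
}

Calling this every few ticks rather than every tick would be a tunable middle ground; whether it removes the 500 ms spikes on the Ubuntu machine is untested.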
OK, two comments: 1) the fact that you can't reproduce it in an MWE suggests that it is probably the OS or other programs causing the issue; 2) if you launch a goroutine to do the write, what does it matter how long it takes? Nothing is blocked :)
– Vorsprung
Nov 20 '18 at 21:30
@Vorsprung, the way it's written now, I do wait on the file writing, since the goroutines are mainly for parallelism of processing. If I can't figure out how to resolve the occasional slow Write, then I'll rewrite it to put all the data to be written on a channel and have a dedicated file-writing goroutine.
– gggg
Nov 20 '18 at 21:45
@Vorsprung, the way it's written now, I do wait on the file writing, since the goroutines are primarily there to avoid race conditions related to changing what data is written. I'm thinking about how to restructure it to 1. not wait on the write, 2. avoid race conditions when write settings change, and 3. guarantee in-order writing while still processing data every 50 ms.
– gggg
Nov 20 '18 at 22:01
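For reference, a minimal sketch of the restructuring described in these comments (illustrative only; runWriter and recordChan are hypothetical names, not from the question). Each file gets one buffered channel and one writer goroutine, so the processing loop never blocks on disk, per-file write order is preserved by the channel's FIFO ordering, and a slow Flush only delays that one file's goroutine:

// runWriter drains records in arrival order; the sender never blocks
// on disk as long as the channel buffer has room, and per-file write
// order is guaranteed because a single goroutine owns the file.
func runWriter(w *bufio.Writer, recordChan <-chan []byte) {
	for rec := range recordChan {
		if _, err := w.Write(rec); err != nil {
			fmt.Printf("write failed: %v\n", err)
			return
		}
	}
	w.Flush() // flush remaining buffered bytes once the channel is closed
}

In the processing loop, each record would be sent with recordChan <- rec (with recordChan created as make(chan []byte, 1024) per file), and closing the channel at shutdown flushes the buffer and ends the goroutine.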