* . *
  • About
  • Advertise
  • Privacy & Policy
  • Contact
Wednesday, May 21, 2025
Earth-News
  • Home
  • Business
  • Entertainment
    Jennifer Lawrence and Robert Pattinson Did This “Humiliating” Thing to Prep for Sex Scenes – Yahoo

    Jennifer Lawrence and Robert Pattinson’s Hilarious Preparation for Steamy Sex Scenes!

    Meet the Cast of FX’s New Comedy ‘Adults’ – WFXG

    Meet the Cast of FX’s New Comedy ‘Adults’ – WFXG

    First Celebrities Playing in the 2025 American Century Championship Announced – El Paso Inc.

    Exciting Lineup Revealed: First Celebrities Set to Compete in the 2025 American Century Championship!

    Palm Beach Council approves Breakers plans for new Family Entertainment Center – Palm Beach Daily News

    Palm Beach Council Greenlights Exciting New Family Entertainment Center at The Breakers!

    Why Was Kanye West’s South Korea Tour Cancelled? – Yahoo

    Unraveling the Mystery: Why Kanye West’s South Korea Tour Was Canceled

    Entertainment Spotlight: ‘Georgie & Mandy’s First Marriage’ – Atlanta News First

  • General
  • Health
  • News

    Cracking the Code: Why China’s Economic Challenges Aren’t Shaking Markets, Unlike America’s” – Bloomberg

    Trump’s Narrow Window to Spread the Truth About Harris

    Trump’s Narrow Window to Spread the Truth About Harris

    Israel-Gaza war live updates: Hamas leader Ismail Haniyeh assassinated in Iran, group says

    Israel-Gaza war live updates: Hamas leader Ismail Haniyeh assassinated in Iran, group says

    PAP Boss to Niger Delta Youths, Stay Away from the Protest

    PAP Boss to Niger Delta Youths, Stay Away from the Protest

    Court Restricts Protests In Lagos To Freedom, Peace Park

    Court Restricts Protests In Lagos To Freedom, Peace Park

    Fans React to Jazz Jennings’ Inspiring Weight Loss Journey

    Fans React to Jazz Jennings’ Inspiring Weight Loss Journey

    Trending Tags

    • Trump Inauguration
    • United Stated
    • White House
    • Market Stories
    • Election Results
  • Science
  • Sports
  • Technology
    Cellino’s iPSC Manufacturing Technology Receives FDA Advanced Manufacturing Technology (AMT) Designation – Business Wire

    Cellino’s Groundbreaking iPSC Manufacturing Technology Earns FDA’s Advanced Manufacturing Designation!

    New technology adapting the study of Anatomy | Why’s Guy’s – WCIA.com

    Revolutionizing Anatomy: How Cutting-Edge Technology is Transforming the Study of the Human Body

    We Think GigaCloud Technology’s (NASDAQ:GCT) Solid Earnings Are Understated – Yahoo Finance

    Unlocking Potential: Why GigaCloud Technology’s Impressive Earnings Deserve More Recognition

    JPMorgan Chase plans to spend $18B in technology in 2025 (JPM:NYSE) – Seeking Alpha

    JPMorgan Chase Unveils Ambitious $18 Billion Tech Investment Plan for 2025!

    Nvidia plans to sell tech to speed AI chip communication – The Indian Express

    Nvidia Unveils Game-Changing Technology to Accelerate AI Chip Communication!

    Murfreesboro LPR technology helps catch suspect in Henry County homicide case – WKRN News 2

    Murfreesboro LPR technology helps catch suspect in Henry County homicide case – WKRN News 2

    Trending Tags

    • Nintendo Switch
    • CES 2017
    • Playstation 4 Pro
    • Mark Zuckerberg
No Result
View All Result
  • Home
  • Business
  • Entertainment
    Jennifer Lawrence and Robert Pattinson Did This “Humiliating” Thing to Prep for Sex Scenes – Yahoo

    Jennifer Lawrence and Robert Pattinson’s Hilarious Preparation for Steamy Sex Scenes!

    Meet the Cast of FX’s New Comedy ‘Adults’ – WFXG

    Meet the Cast of FX’s New Comedy ‘Adults’ – WFXG

    First Celebrities Playing in the 2025 American Century Championship Announced – El Paso Inc.

    Exciting Lineup Revealed: First Celebrities Set to Compete in the 2025 American Century Championship!

    Palm Beach Council approves Breakers plans for new Family Entertainment Center – Palm Beach Daily News

    Palm Beach Council Greenlights Exciting New Family Entertainment Center at The Breakers!

    Why Was Kanye West’s South Korea Tour Cancelled? – Yahoo

    Unraveling the Mystery: Why Kanye West’s South Korea Tour Was Canceled

    Entertainment Spotlight: ‘Georgie & Mandy’s First Marriage’ – Atlanta News First

  • General
  • Health
  • News

    Cracking the Code: Why China’s Economic Challenges Aren’t Shaking Markets, Unlike America’s” – Bloomberg

    Trump’s Narrow Window to Spread the Truth About Harris

    Trump’s Narrow Window to Spread the Truth About Harris

    Israel-Gaza war live updates: Hamas leader Ismail Haniyeh assassinated in Iran, group says

    Israel-Gaza war live updates: Hamas leader Ismail Haniyeh assassinated in Iran, group says

    PAP Boss to Niger Delta Youths, Stay Away from the Protest

    PAP Boss to Niger Delta Youths, Stay Away from the Protest

    Court Restricts Protests In Lagos To Freedom, Peace Park

    Court Restricts Protests In Lagos To Freedom, Peace Park

    Fans React to Jazz Jennings’ Inspiring Weight Loss Journey

    Fans React to Jazz Jennings’ Inspiring Weight Loss Journey

    Trending Tags

    • Trump Inauguration
    • United Stated
    • White House
    • Market Stories
    • Election Results
  • Science
  • Sports
  • Technology
    Cellino’s iPSC Manufacturing Technology Receives FDA Advanced Manufacturing Technology (AMT) Designation – Business Wire

    Cellino’s Groundbreaking iPSC Manufacturing Technology Earns FDA’s Advanced Manufacturing Designation!

    New technology adapting the study of Anatomy | Why’s Guy’s – WCIA.com

    Revolutionizing Anatomy: How Cutting-Edge Technology is Transforming the Study of the Human Body

    We Think GigaCloud Technology’s (NASDAQ:GCT) Solid Earnings Are Understated – Yahoo Finance

    Unlocking Potential: Why GigaCloud Technology’s Impressive Earnings Deserve More Recognition

    JPMorgan Chase plans to spend $18B in technology in 2025 (JPM:NYSE) – Seeking Alpha

    JPMorgan Chase Unveils Ambitious $18 Billion Tech Investment Plan for 2025!

    Nvidia plans to sell tech to speed AI chip communication – The Indian Express

    Nvidia Unveils Game-Changing Technology to Accelerate AI Chip Communication!

    Murfreesboro LPR technology helps catch suspect in Henry County homicide case – WKRN News 2

    Murfreesboro LPR technology helps catch suspect in Henry County homicide case – WKRN News 2

    Trending Tags

    • Nintendo Switch
    • CES 2017
    • Playstation 4 Pro
    • Mark Zuckerberg
No Result
View All Result
Earth-News
No Result
View All Result
Home Technology

The One Billion Row Challenge in Go: from 1m45s to 4s in nine solutions

March 3, 2024
in Technology
The One Billion Row Challenge in Go: from 1m45s to 4s in nine solutions
Share on FacebookShare on Twitter

March 2024

Go to: Baseline | Solutions | Table of results | Final comments

I saw the One Billion Row Challenge a couple of weeks ago, and it thoroughly nerd-sniped me, so I went to Go solve it.

I’m late to the party, as the original competition was in January. It was also in Java. I’m not particularly interested in Java, but I’ve been interested in optimising Go code for a while.

This challenge was particularly simple: process a text file of weather station names and temperatures, and for each weather station, print out the minimum, mean, and maximum. There are a few other constraints to make it simpler, though I ignored the Java-specific ones.

Here are a few lines of example input:

Hamburg;12.0
Bulawayo;8.9
Palembang;38.8
St. John’s;15.2
Cracow;12.6
…

The only catch: the input file has one billion rows (lines). That’s about 13GB of data. I’ve already figured out that disk I/O is no longer the bottleneck – it’s usually memory allocations and parsing that slow things down in a program like this.

This article describes the nine solutions I wrote in Go, each faster than the previous. The first, a simple and idiomatic solution, runs in 1 minute 45 seconds on my machine, while the last one runs in about 4 seconds. As I go, I’ll show how I used Go’s profiler to see where the time was being spent.

The run-down of solutions is as follows, slowest to fastest:

r1: simple and idiomatic
r2: map with pointer values
r3: parse temperatures by hand
r4: fixed point integers
r5: avoid bytes.Cut
r6: avoid bufio.Scanner
r7: custom hash table
r8: parallelise r1
r9: parallelise r7

I wanted each of the solutions to be portable Go using only the standard library: no assembly, no unsafe, and no memory-mapped files. And 4 seconds, or 3.2GB/s, was fast enough for me. For comparison, the fastest, heavily-optimised Java solution runs in just under a second on my machine – not bad!

There are several other Go solutions out there already, and at least one nice write-up. Mine is faster than some solutions, but slightly slower than the fastest one. However, I didn’t look at any of these before writing mine – I wanted my solutions to be independent.

If you just care about the numbers, skip down to the table of results.

Baseline

Here are a few different baseline measurements to set the stage. First, how long does it take to just read 13GB of data, using cat:

$ time cat measurements.txt>/dev/null
0m1.052s

Note that that’s a best-of-five measurement, so I’m allowing the file to be cached. Who knows whether Linux will allow all 13GB to be kept in disk cache, though presumably it does, because the first time it took closer to 6 seconds.

For comparison, to actually do something with the file is significantly slower: wc takes almost a minute:

$ time wc measurements.txt
1000000000 1179173106 13795293380 measurements.txt
0m55.710s

For a simple solution to the actual problem, I’d probably start with AWK. This solution uses Gawk, because sorting the output is easier with its asorti function. I’m using the -b option to use “characters as bytes” mode, which makes things a bit faster:

$ time gawk -b -f 1brc.awk measurements.txt>measurements.out
7m35.567s

I’m sure I can beat 7 minutes with even a simple Go solution, so let’s start there.

I’m going to start by optimising the sequential, single-core version (solutions 1-7), and then parallelise it (solutions 8 and 9). All my results are done using Go 1.21.5 on a linux/amd64 laptop with a fast SSD drive and 32GB of RAM.

Many of my solutions, and most of the fastest solutions, assume valid input. For example, that the temperatures have exactly one decimal digit. Several of my solutions will cause a runtime panic, or produce incorrect output, if the input isn’t valid.

Solution 1: simple and idiomatic Go

I wanted my first version to be simple, straight-forward code using the tools in the Go standard library: bufio.Scanner to read the lines, strings.Cut to split on the ‘;’, strconv.ParseFloat to parse the temperatures, and an ordinary Go map to accumulate the results.

I’ll include this first solution in full (but after that, show only the interesting bits):

func r1(inputPath string, output io.Writer) error {
type stats struct {
min, max, sum float64
count int64
}

f, err :=os.Open(inputPath)
if err !=nil {
return err
}
defer f.Close()

stationStats :=make(map[string]stats)

scanner :=bufio.NewScanner(f)
for scanner.Scan() {
line :=scanner.Text()
station, tempStr, hasSemi :=strings.Cut(line, “;”)
if !hasSemi {
continue
}

temp, err :=strconv.ParseFloat(tempStr, 64)
if err !=nil {
return err
}

s, ok :=stationStats[station]
if !ok {
s.min=temp
s.max=temp
s.sum=temp
s.count=1
} else {
s.min=min(s.min, temp)
s.max=max(s.max, temp)
s.sum +=temp
s.count++
}
stationStats[station]=s
}

stations :=make([]string, 0, len(stationStats))
for station :=range stationStats {
stations=append(stations, station)
}
sort.Strings(stations)

fmt.Fprint(output, “{“)
for i, station :=range stations {
if i> 0 {
fmt.Fprint(output, “, “)
}
s :=stationStats[station]
mean :=s.sum / float64(s.count)
fmt.Fprintf(output, “%s=%.1f/%.1f/%.1f”, station, s.min, mean, s.max)
}
fmt.Fprint(output, “}n”)
return nil
}

This basic solution processes the one billion rows in 1 minute and 45 seconds. A definite improvement over 7 minutes for the AWK solution.

Solution 2: map with pointer values

I’d learned doing my count-words program that we’re doing a bunch more hashing than we need to. For each line, we’re hashing the string twice: once when we try to fetch the value from the map, and once when we update the map.

To avoid that, we can use a map[string]*stats (pointer values) and update the pointed-to struct, instead of a map[string]stats and updating the hash table itself.

However, I first wanted to confirm that using the Go profiler. It only takes a few lines to add CPU profiling to a Go program.

$ ./go-1brc -cpuprofile=cpu.prof -revision=1 measurements-10000000.txt>measurements-10000000.out
Processed 131.6MB in 965.888929ms
$ go tool pprof -http=: cpu.prof
…

Those commands produced the following profile of solution 1, run over a cut-down 10 million row input file:

Profile of solution r1

Map operations are taking a full 30% of the time: 12.24% for assign and 17.35% for lookup. We should be able to get rid of most of the map assign time by using a pointer value.

As a side note, this profile image also shows us where all the rest of the time is going:

Scanning lines with Scanner.Scan
Finding the ‘;’ with strings.Cut
Parsing the temperature with strconv.ParseFloat
Calling Scanner.Text, which allocates a string for the line

In any case, my second solution is just a small tweak to the map operations:

stationStats :=make(map[string]*stats)
scanner :=bufio.NewScanner(f)
for scanner.Scan() {
// …
s :=stationStats[station]
if s==nil {
stationStats[station]=&stats{
min: temp,
max: temp,
sum: temp,
count: 1,
}
} else {
s.min=min(s.min, temp)
s.max=max(s.max, temp)
s.sum +=temp
s.count++
}
}

In the common case where the station exists in the map, we now only do one map operation, s :=stationStats[station], so that hashing the station name and accessing the hash table only has to be done once. If it’s in the map already – the common case in one billion rows – we update the existing pointed-to struct.

It doesn’t help a ton, but it is something: using pointer values in the map takes our time down from 1 minute 45 seconds to 1 minute 31 seconds.

Solution 3: avoid strconv.ParseFloat

My third solution is where things get a bit more hard-core: parse the temperature using custom code instead of strconv.ParseFloat. The standard library function handles a ton of edge cases we don’t need to support for the simple temperatures our input has: 2 or 3 digits in the format 1.2 or 34.5 (and some with a minus sign in front).

Also, strconv.ParseFloat took a string argument, and now that we’re no longer calling that, we can get away with using the byte slice directly from Scanner.Bytes, instead of allocating and copying a string with Scanner.Text.

Now we parse the temperature this way:

negative :=false
index :=0
if tempBytes[index]==’-‘ {
index++
negative=true
}
temp :=float64(tempBytes[index] – ‘0’) // parse first digit
index++
if tempBytes[index] !=’.’ {
temp=temp*10 + float64(tempBytes[index]-‘0’) // parse optional second digit
index++
}
index++ // skip ‘.’
temp +=float64(tempBytes[index]-‘0’) / 10 // parse decimal digit
if negative {
temp=-temp
}

Not pretty, but not exactly rocket science either. This gets our time down from to 1 minute 31 seconds to under a minute: 55.8 seconds.

Solution 4: fixed point integers

Back in the olden days, floating point instructions were a lot slower than integer ones. These days, they’re only a little slower, but it’s probably worth avoiding them if we can.

For this problem, each temperature has a single decimal digit, so it’s easy to use fixed point integers to represent them. For example, we can represent 34.5 as the integer 345. Then at the end, just before we print out the results, we convert them back to floats.

So my fourth solution is basically the same as solution 3, but with the stats struct field as follows:

type stats struct {
min, max, count int32
sum int64
}

Then, when printing out the results, we need to divide by 10:

mean :=float64(s.sum) / float64(s.count) / 10
fmt.Fprintf(output, “%s=%.1f/%.1f/%.1f”,
station, float64(s.min)/10, mean, float64(s.max)/10)

I’ve used 32-bit integers for minimum and maximum temperatures, as the highest we’ll probably have is about 500 (50 degrees Celsius). We could use int16, but from previous experience, modern 64-bit CPUs are slightly slower when handling 16-bit integers than 32-bit ones. In my tests just now it didn’t seem to make a measurable difference, but I opted for 32-bit anyway.

Using integers cut the time down from 55.8 seconds to 51.0 seconds, a small win.

Solution 5: avoid bytes.Cut

To make solution 5 I recorded another profile (of solution 4):

Profile of solution r4

Okay, now it’s getting harder. The map operations dominate, and moving to a custom hash table will be a bit tricky. So will getting rid of the bufio.Scanner. Let’s procrastinate and get rid of the bytes.Cut.

I thought of a simple way to save time. If we look at a line, for example:

It’s going to be faster to parse the temperature from the end and find the ‘;’ there than to scan through the whole station name to look for the ‘;’. This rather ugly code does just that:

end :=len(line)
tenths :=int32(line[end-1] – ‘0’)
ones :=int32(line[end-3] – ‘0’) // line[end-2] is ‘.’
var temp int32
var semicolon int
if line[end-4]==’;’ { // positive N.N temperature
temp=ones*10 + tenths
semicolon=end – 4
} else if line[end-4]==’-‘ { // negative -N.N temperature
temp=-(ones*10 + tenths)
semicolon=end – 5
} else {
tens :=int32(line[end-4] – ‘0’)
if line[end-5]==’;’ { // positive NN.N temperature
temp=tens*100 + ones*10 + tenths
semicolon=end – 5
} else { // negative -NN.N temperature
temp=-(tens*100 + ones*10 + tenths)
semicolon=end – 6
}
}
station :=line[:semicolon]

Avoiding bytes.Cut cut the time down from 51.0 seconds to 46.0 seconds, another small win.

Solution 6: avoid bufio.Scanner

Now we’re going to try to get rid of the bufio.Scanner. If you think about it, to find the end of each line, the scanner has to look through all the bytes to find the newline character. Then we process many of the bytes again to parse the temperature and find the ‘;’. So let’s try to combine those steps and throw bufio.Scanner out the window.

In solution 6 we allocate a 1MB buffer to read the file in large chunks, look for the last newline in the chunk to ensure we’re not chopping a line in half, and then process each chunk. That looks like this:

buf :=make([]byte, 1024*1024)
readStart :=0
for {
n, err :=f.Read(buf[readStart:])
if err !=nil && err !=io.EOF {
return err
}
if readStart+n==0 {
break
}
chunk :=buf[:readStart+n]

newline :=bytes.LastIndexByte(chunk, ‘n’)
if newline=len(items) {
hashIndex=0
}
}
}

readStart=copy(buf, remaining)
}

The payoff for all this code is large: the custom hash table cuts down the time from 41.3 seconds to 25.8s.

Solution 8: process chunks in parallel

In solution 8 I wanted to add some parallelism. However, I thought I’d go back to the simple and idiomatic code from my first solution, which uses bufio.Scanner and strconv.ParseFloat, and parallelise that. Then we can see whether optimising or parallelising gives better results – and in solution 9 we’ll do both.

It’s straight-forward to parallelise a map-reduce problem like this: split the file into similar-sized chunks (one for each CPU core), fire up a thread (in Go, a goroutine) to process each chunk, and then merge the results at the end.

Here’s what that looks like at a high level:

// Determine non-overlapping parts for file split (each part has offset and size).
parts, err :=splitFile(inputPath, maxGoroutines)
if err !=nil {
return err
}

// Start a goroutine to process each part, returning results on a channel.
resultsCh :=make(chan map[string]r8Stats)
for _, part :=range parts {
go r8ProcessPart(inputPath, part.offset, part.size, resultsCh)
}

// Wait for the results to come back in and aggregate them.
totals :=make(map[string]r8Stats)
for i :=0; i
>>> Read full article>>>
Copyright for syndicated content belongs to the linked Source : Hacker News – https://benhoyt.com/writings/go-1brc/

Tags: billionChallengetechnology
Previous Post

The ‘banned’ Star Trek episode that promised a united Ireland

Next Post

Dubai to host ‘India by the Creek’, a three-day cultural extravaganza from March 8

China to give $500 million to WHO in next 5 years, official says – Reuters

China to give $500 million to WHO in next 5 years, official says – Reuters

May 21, 2025
Consumers Prop Up the Economy. They’re Showing Signs of Strain. – The New York Times

Consumers Prop Up the Economy. They’re Showing Signs of Strain. – The New York Times

May 21, 2025
Jennifer Lawrence and Robert Pattinson Did This “Humiliating” Thing to Prep for Sex Scenes – Yahoo

Jennifer Lawrence and Robert Pattinson’s Hilarious Preparation for Steamy Sex Scenes!

May 21, 2025
United States Appears Set to Skip World Health Assembly while China Sends Over 180 Delegates to Geneva – Health Policy Watch

United States Appears Set to Skip World Health Assembly while China Sends Over 180 Delegates to Geneva – Health Policy Watch

May 21, 2025
Trump administration working on plan to move 1 million Palestinians to Libya – NBC News

Trump Administration’s Bold Proposal: Relocating 1 Million Palestinians to Libya

May 21, 2025
Airbus Subsidiary Chooses Aeva to Provide 4D LiDAR Technology for Project – Thomasnet

Airbus Subsidiary Partners with Aeva for Cutting-Edge 4D LiDAR Technology in Exciting New Project!

May 21, 2025
Notre Dame hires alum Pat Garrity to trailblazing role overseeing both men’s and women’s basketball teams – CBS Sports

Notre Dame hires alum Pat Garrity to trailblazing role overseeing both men’s and women’s basketball teams – CBS Sports

May 20, 2025
4th Open Forum on Ecology and Preservation of Water Bodies Held at SUSU – South Ural State University

4th Open Forum on Ecology and Preservation of Water Bodies Held at SUSU – South Ural State University

May 20, 2025
Kalley Hurt, a Carnegie High School science teacher, receives a $150 check from representative Mike Hixson – Southwest Ledger

Kalley Hurt, a Carnegie High School science teacher, receives a $150 check from representative Mike Hixson – Southwest Ledger

May 20, 2025
Hospital superbug can feed on medical plastic, first-of-its-kind study reveals – Live Science

Revolutionary Study Uncovers Hospital Superbug’s Ability to Thrive on Medical Plastics!

May 20, 2025

Categories

Archives

May 2025
MTWTFSS
 1234
567891011
12131415161718
19202122232425
262728293031 
« Apr    
Earth-News.info

The Earth News is an independent English-language daily published Website from all around the World News

Browse by Category

  • Business (20,132)
  • Ecology (624)
  • Economy (638)
  • Entertainment (21,552)
  • General (15,224)
  • Health (9,680)
  • Lifestyle (642)
  • News (22,149)
  • People (640)
  • Politics (646)
  • Science (15,861)
  • Sports (21,148)
  • Technology (15,628)
  • World (628)

Recent News

China to give $500 million to WHO in next 5 years, official says – Reuters

China to give $500 million to WHO in next 5 years, official says – Reuters

May 21, 2025
Consumers Prop Up the Economy. They’re Showing Signs of Strain. – The New York Times

Consumers Prop Up the Economy. They’re Showing Signs of Strain. – The New York Times

May 21, 2025
  • About
  • Advertise
  • Privacy & Policy
  • Contact

© 2023 earth-news.info

No Result
View All Result

© 2023 earth-news.info

No Result
View All Result

© 2023 earth-news.info

Go to mobile version