roaring

This is a Go version of the Roaring bitmap data structure.

Roaring bitmaps are used by several major systems such as Apache Lucene and derivative systems such as Solr and Elasticsearch, Apache Druid (Incubating), LinkedIn Pinot, Netflix Atlas, Apache Spark, OpenSearchServer, Cloud Torrent, Whoosh, Pilosa, Microsoft Visual Studio Team Services (VSTS), and eBay's Apache Kylin.

Roaring bitmaps are found to work well in many important applications:

Use Roaring for bitmap compression whenever possible. Do not use other bitmap compression methods (Wang et al., SIGMOD 2017)

The roaring Go library is used by

This library is used in production in several systems and is part of the Awesome Go collection.

There are also Java and C/C++ versions. The Java, C, C++ and Go versions are binary compatible: e.g., you can save bitmaps from a Java program and load them back in Go, and vice versa. We have a format specification.

This code is licensed under Apache License, Version 2.0 (ASL2.0).

Copyright 2016-... by the authors.

References

  • Daniel Lemire, Owen Kaser, Nathan Kurz, Luca Deri, Chris O'Hara, François Saint-Jacques, Gregory Ssi-Yan-Kai, Roaring Bitmaps: Implementation of an Optimized Software Library, Software: Practice and Experience 48 (4), 2018 arXiv:1709.07821
  • Samy Chambi, Daniel Lemire, Owen Kaser, Robert Godin, Better bitmap performance with Roaring bitmaps, Software: Practice and Experience 46 (5), 2016. http://arxiv.org/abs/1402.6407 This paper used data from http://lemire.me/data/realroaring2014.html
  • Daniel Lemire, Gregory Ssi-Yan-Kai, Owen Kaser, Consistently faster and smaller compressed bitmaps with Roaring, Software: Practice and Experience 46 (11), 2016. http://arxiv.org/abs/1603.06549

Dependencies

Dependencies are fetched automatically by giving the -t flag to go get.

They include:

  • github.com/willf/bitset
  • github.com/mschoch/smat
  • github.com/glycerine/go-unsnap-stream
  • github.com/philhofer/fwd
  • github.com/jtolds/gls

Note that the smat library requires Go 1.6 or later.

Installation

  • go get -t github.com/RoaringBitmap/roaring

Example

Here is a simplified but complete example:

package main

import (
    "bytes"
    "fmt"

    "github.com/RoaringBitmap/roaring"
)

func main() {
    // example inspired by https://github.com/fzandona/goroar
    fmt.Println("==roaring==")
    rb1 := roaring.BitmapOf(1, 2, 3, 4, 5, 100, 1000)
    fmt.Println(rb1.String())

    rb2 := roaring.BitmapOf(3, 4, 1000)
    fmt.Println(rb2.String())

    rb3 := roaring.New()
    fmt.Println(rb3.String())

    fmt.Println("Cardinality: ", rb1.GetCardinality())

    fmt.Println("Contains 3? ", rb1.Contains(3))

    rb1.And(rb2)

    rb3.Add(1)
    rb3.Add(5)

    rb3.Or(rb1)

    // computes union of the three bitmaps in parallel using 4 workers  
    roaring.ParOr(4, rb1, rb2, rb3)
    // computes intersection of the three bitmaps in parallel using 4 workers  
    roaring.ParAnd(4, rb1, rb2, rb3)


    // prints 1, 3, 4, 5, 1000
    i := rb3.Iterator()
    for i.HasNext() {
        fmt.Println(i.Next())
    }
    fmt.Println()

    // next we include an example of serialization
    buf := new(bytes.Buffer)
    rb1.WriteTo(buf) // we omit error handling
    newrb := roaring.New()
    newrb.ReadFrom(buf)
    if rb1.Equals(newrb) {
        fmt.Println("I wrote the content to a byte stream and read it back.")
    }
    // you can iterate over bitmaps using Iterator(), ReverseIterator(), or ManyIterator()
}

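If you want to serialize a bitmap and handle errors, consider the following fragment. It is written as it would appear inside a test function of the roaring package, hence the unqualified BitmapOf and New and the variable t (a *testing.T):

	rb := BitmapOf(1, 2, 3, 4, 5, 100, 1000)
	buf := new(bytes.Buffer)
	// WriteTo also returns the number of bytes written; it is discarded here
	_, err := rb.WriteTo(buf)
	if err != nil {
		t.Errorf("Failed writing")
	}
	newrb := New()
	// ReadFrom also returns the number of bytes read; it is discarded here
	_, err = newrb.ReadFrom(buf)
	if err != nil {
		t.Errorf("Failed reading")
	}
	if !rb.Equals(newrb) {
		t.Errorf("Cannot retrieve serialized version")
	}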

Given N integers in [0,x), the serialized size in bytes of a Roaring bitmap should never exceed this bound:

8 + 9 * ((long)x+65535)/65536 + 2 * N

That is, given a fixed overhead for the universe size (x), Roaring bitmaps never use more than 2 bytes per integer. You can call BoundSerializedSizeInBytes for a more precise estimate.
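
As a rough illustration, the bound above can be evaluated directly. This is only a sketch: the helper name simpleBound is hypothetical and not part of the library, and it assumes the expression is read as written, with multiplication and integer division applied left to right.

```go
package main

import "fmt"

// simpleBound evaluates 8 + 9*(x+65535)/65536 + 2*n, where n is the number
// of integers stored and x is the universe size (values lie in [0, x)).
// Hypothetical helper for illustration; it is not a roaring API.
func simpleBound(n, x uint64) uint64 {
	// '*' and '/' share precedence and associate left to right, so this is
	// (9*(x+65535))/65536, matching the expression as written above.
	return 8 + 9*(x+65535)/65536 + 2*n
}

func main() {
	// 1000 integers drawn from [0, 65536)
	fmt.Println(simpleBound(1000, 65536)) // prints 2025
}
```

The 2*n term dominates once n is large relative to the number of 65536-value chunks in the universe, which is where the "2 bytes per integer" figure comes from.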

Documentation

Current documentation is available at http://godoc.org/github.com/RoaringBitmap/roaring

Goroutine safety

In general, it should not be considered safe to access the same bitmap from different goroutines: bitmaps are left unsynchronized for performance. Should you want to access a Bitmap from more than one goroutine, you should provide synchronization. Typically this is done by using channels to pass the *Bitmap around (in Go style, so there is only ever one owner), or by using sync.Mutex to serialize operations on Bitmaps.
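
The sync.Mutex approach could be sketched as follows. To keep the example free of external dependencies, it guards a small interface rather than a *Bitmap directly. The names uint32Set, mapSet, lockedSet and fillConcurrently are all hypothetical, and mapSet is only a stand-in. A *roaring.Bitmap has Add, Contains and GetCardinality methods with these signatures, so it should be usable in place of mapSet.

```go
package main

import (
	"fmt"
	"sync"
)

// uint32Set lists the methods this sketch needs (hypothetical interface).
type uint32Set interface {
	Add(x uint32)
	Contains(x uint32) bool
	GetCardinality() uint64
}

// mapSet is a stand-in set used only so the example runs with the standard
// library alone; it is not part of roaring.
type mapSet map[uint32]struct{}

func (m mapSet) Add(x uint32)           { m[x] = struct{}{} }
func (m mapSet) Contains(x uint32) bool { _, ok := m[x]; return ok }
func (m mapSet) GetCardinality() uint64 { return uint64(len(m)) }

// lockedSet serializes every operation on the wrapped set with a mutex.
type lockedSet struct {
	mu  sync.Mutex
	set uint32Set
}

func (l *lockedSet) Add(x uint32) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.set.Add(x)
}

func (l *lockedSet) Contains(x uint32) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.set.Contains(x)
}

func (l *lockedSet) GetCardinality() uint64 {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.set.GetCardinality()
}

// fillConcurrently adds the values 0 .. workers*perWorker-1 to the set,
// using one goroutine per worker, and waits for all of them to finish.
func fillConcurrently(l *lockedSet, workers, perWorker int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				l.Add(uint32(w*perWorker + i))
			}
		}(w)
	}
	wg.Wait()
}

func main() {
	l := &lockedSet{set: mapSet{}}
	fillConcurrently(l, 4, 1000)
	fmt.Println(l.GetCardinality()) // prints 4000
}
```

A single coarse lock like this is simple but serializes all access. Passing ownership of the bitmap over a channel avoids the lock entirely when only one goroutine needs the bitmap at a time.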

Coverage

We test our software. For a report on our test coverage, see

https://coveralls.io/github/RoaringBitmap/roaring?branch=master

Benchmark

Type

     go test -bench Benchmark -run -

To run benchmarks on Real Roaring Datasets run the following:

go get github.com/RoaringBitmap/real-roaring-datasets
BENCH_REAL_DATA=1 go test -bench BenchmarkRealData -run -

Interactive use

You can use roaring with gore:

  • go get -u github.com/motemen/gore
  • Make sure that $GOPATH/bin is in your $PATH.
  • go get github.com/RoaringBitmap/roaring
$ gore
gore version 0.2.6  :help for help
gore> :import github.com/RoaringBitmap/roaring
gore> x:=roaring.New()
gore> x.Add(1)
gore> x.String()
"{1}"

Fuzz testing

You can help us further test the library with fuzz testing:

     go get github.com/dvyukov/go-fuzz/go-fuzz
     go get github.com/dvyukov/go-fuzz/go-fuzz-build
     go test -tags=gofuzz -run=TestGenerateSmatCorpus
     go-fuzz-build github.com/RoaringBitmap/roaring
     go-fuzz -bin=./roaring-fuzz.zip -workdir=workdir/ -timeout=200

Let it run, and if the number of crashers is greater than 0, check the reports in the workdir, where you should be able to find the panic goroutine stack traces.

Alternative in Go

There is a Go version wrapping the C/C++ implementation: https://github.com/RoaringBitmap/gocroaring

For an alternative implementation in Go, written independently of this one, see https://github.com/fzandona/goroar

Mailing list/discussion group

https://groups.google.com/forum/#!forum/roaring-bitmaps