
BB3 corpus

Descriptive statistics of the BB3 corpus

This page gives details on the BB3 datasets, in terms of corpus size and number of annotations.

BB-cat & BB-cat+ner


                             BB-cat                   BB-cat+ner
                       Train     Dev    Test      Train     Dev    Test
Documents                 71      36      54         71      36      54
Words                 16,295   8,890  13,797     16,295   8,890  13,933
Bacteria                 375     244     347        375     244     401
Habitat                  747     454     720        747     454     621
Total entities         1,122     698   1,067      1,122     698   1,022
Bacteria categories      376     245     347        376     245     401
Habitat categories       825     535     861        825     535     681
Total categories       1,201     780   1,208      1,201     780   1,082
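
The "Total" rows can be cross-checked against the per-type rows. The following is a minimal sketch, not part of the BB3 distribution: the counts are copied by hand from the BB-cat and BB-cat+ner statistics above, and the names BB_CAT, column_sums and totals_consistent are hypothetical helpers introduced only for this illustration.

```python
# Counts copied by hand from the BB-cat / BB-cat+ner statistics on this page.
# Column order: BB-cat Train, Dev, Test, then BB-cat+ner Train, Dev, Test.
BB_CAT = {
    "Bacteria": [375, 244, 347, 375, 244, 401],
    "Habitat": [747, 454, 720, 747, 454, 621],
    "Total entities": [1122, 698, 1067, 1122, 698, 1022],
    "Bacteria categories": [376, 245, 347, 376, 245, 401],
    "Habitat categories": [825, 535, 861, 825, 535, 681],
    "Total categories": [1201, 780, 1208, 1201, 780, 1082],
}


def column_sums(table, parts):
    """Add the named rows column by column."""
    return [sum(values) for values in zip(*(table[name] for name in parts))]


def totals_consistent(table, parts, total):
    """Return True if the `total` row equals the column-wise sum of `parts`."""
    return column_sums(table, parts) == table[total]


print(totals_consistent(BB_CAT, ["Bacteria", "Habitat"], "Total entities"))
print(totals_consistent(
    BB_CAT, ["Bacteria categories", "Habitat categories"], "Total categories"))
```

With the figures as listed, both checks print True: in each column the Bacteria and Habitat counts add up to the reported total. The same kind of check could be applied to the other datasets, adding the Geographical row where one is reported.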


BB-event & BB-event+ner


                            BB-event                 BB-event+ner
                       Train     Dev    Test      Train     Dev    Test
Documents                 61      34      51         71      36      54
Words                 13,850   8,491  13,039     16,295   8,890  13,933
Bacteria                 358     238     336        375     244     401
Habitat                  687     454     720        747     454     621
Geographical              35      38      37         36      38      27
Total entities         1,080     730   1,093      1,158     736   1,049
Lives_in events          327     223     340        327     223     314


BB-kb & BB-kb+ner


                             BB-kb                    BB-kb+ner
                       Train     Dev    Test      Train     Dev    Test
Documents                 61      34      50         71      36      54
Words                 13,850   8,491  12,758     16,295   8,890  13,933
Bacteria                 358     238     330        375     244     401
Habitat                  687     454     720        747     454     621
Total entities         1,045     692   1,050      1,122     698   1,022
Bacteria categories      359     239     330        376     245     401
Habitat categories       765     535     861        825     535     681
Total categories       1,124     774   1,191      1,201     780   1,082
Lives_in events          294     186     312        294     186     288