The CIFAR Series: CIFAR-10 and CIFAR-100

Official site: http://www.cs.toronto.edu/~kriz/cifar.html

Download link (Baidu Netdisk): https://pan.baidu.com/s/1l1LZqs7n48OlGIciErcFWg
Extraction code: 1234


Part 1: CIFAR-10

import pickle
import cv2
import numpy as np
import os
# 1. Set the file paths
file_data_batch_1 = './cifar10/data_batch_1'
file_data_batch_2 = './cifar10/data_batch_2'
file_data_batch_3 = './cifar10/data_batch_3'
file_data_batch_4 = './cifar10/data_batch_4'
file_data_batch_5 = './cifar10/data_batch_5'
file_batches_meta = './cifar10/batches.meta'
file_test_batch = './cifar10/test_batch'

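# 2. Load a data file into a dict
def unpickle(file):  # reads one of the files shipped with CIFAR-10 into a Python dict
    # 'iso-8859-1' lets Python 3 decode the Python 2 strings stored in the pickle
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='iso-8859-1')
    return batch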

# 3. Inspect the training set
dict_train_batch1 = unpickle(file_data_batch_1)  # read the data_batch file into a dict
print("*********the 4 key-value pairs of the dict**************")
print(dict_train_batch1.keys())  # the dict holds 4 key-value pairs
print("*********dict_train_batch1**************")
print(dict_train_batch1)  # each batch is one dict

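'''
The training-set dict has four entries:

1. batch_label: which training batch this is
2. labels: the label (an integer 0-9) for each row of data; a 1-D list of 10000 elements
3. data: the image data (values 0-255); a 10000*3072 2-D numpy array in which each row is one 32*32 image, split into 3 segments (red, green, blue channels) of 1024 elements each
4. filenames: the file name (png) for each row of data; also a 1-D list of 10000 elements

'''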

print('---------------------data--------------------------')
data_train_batch1 = dict_train_batch1.get('data')  # get data from the dict
print(data_train_batch1)
print(data_train_batch1.shape)
print('---------------------labels--------------------------')
labels = dict_train_batch1.get('labels')  # get labels from the dict
print(labels)
print(len(labels))
print('-----------------------filenames------------------------')
filenames = dict_train_batch1.get('filenames')  # get filenames from the dict
print(filenames)
print(len(filenames))


# 4. Inspect the test set
dict_test_batch = unpickle(file_test_batch)
print('------------------dict_test_batch-----------------------------')
print(dict_test_batch.keys())

# 5. Inspect the meta file
dict_meta_batch = unpickle(file_batches_meta)
print('------------------dict_meta_batch-----------------------------')
print(dict_meta_batch)


# 6. Convert the arrays to images
'''
1. Turn each row into a three-channel image
2. Save the image into the folder for its label
'''
def makedir(path):
    # Create the directory (including parent directories) if it does not exist yet
    if not os.path.exists(path):
        os.makedirs(path)
        print('Directory created!')

# 6.1 Collect the paths of all data files (five training batches plus the test batch)
file_data_batch_list = [file_data_batch_1,file_data_batch_2,file_data_batch_3,file_data_batch_4,file_data_batch_5,file_test_batch]

# 6.2 Turn each row into a three-channel image and save it into its label folder
for i in range(6):
    dict_train_batch = unpickle(file_data_batch_list[i])
    data_train_batch = dict_train_batch.get('data')  # get data from the dict
    labels = dict_train_batch.get('labels')  # get labels from the dict
    for j in range(10000):
        matrix = np.reshape(data_train_batch[j], (3, 32, 32))
        matrix = matrix.transpose(1, 2, 0)  # (channel, row, col) --> (row, col, channel), still RGB
        matrix = cv2.cvtColor(matrix, cv2.COLOR_RGB2BGR)  # cv2.imwrite expects BGR channel order
        label = labels[j]
        makedir("./image/"+str(label))  # create the folder that holds the images
        cv2.imwrite("./image/"+str(label)+"/"+str(i)+"_"+str(j)+".png", matrix)  # save the image

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.
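
The split described above can be checked in a few lines. This is a minimal sketch, assuming each batch has already been loaded into a dict with 'data' and 'labels' keys; `merge_batches` and `class_counts` are hypothetical helper names, and the tiny synthetic batches only stand in for the real files.

```python
from collections import Counter

import numpy as np


def merge_batches(batches):
    """Stack the 'data' arrays and join the 'labels' lists of several batch dicts."""
    data = np.concatenate([np.asarray(b['data']) for b in batches], axis=0)
    labels = [label for b in batches for label in b['labels']]
    return data, labels


def class_counts(labels):
    """Return {class index: number of images}."""
    return dict(Counter(labels))


# Synthetic stand-ins for real batches: 2 batches, 4 images each, 2 classes.
# Each batch is unbalanced on its own, but the merged set is balanced.
fake_batches = [
    {'data': np.zeros((4, 3072), dtype=np.uint8), 'labels': [0, 0, 0, 1]},
    {'data': np.zeros((4, 3072), dtype=np.uint8), 'labels': [1, 1, 1, 0]},
]
data, labels = merge_batches(fake_batches)
print(data.shape)              # (8, 3072)
print(class_counts(labels))    # {0: 4, 1: 4}
```

With the five real training batches, the same two calls should report a 50000*3072 array and 5000 images for each of the 10 classes, if the files match the description.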


The 10 classes are: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.


The data downloaded from the official site is no longer raw image files; it has already been converted into numeric NumPy arrays.


Reading the data

1. Set the file paths

file_data_batch_1 = '.\\major_dataset_repo\\cifar10\\data_batch_1'
file_data_batch_2 = '.\\major_dataset_repo\\cifar10\\data_batch_2'
file_data_batch_3 = '.\\major_dataset_repo\\cifar10\\data_batch_3'
file_data_batch_4 = '.\\major_dataset_repo\\cifar10\\data_batch_4'
file_data_batch_5 = '.\\major_dataset_repo\\cifar10\\data_batch_5'
file_batches_meta = '.\\major_dataset_repo\\cifar10\\batches.meta'
file_test_batch = '.\\major_dataset_repo\\cifar10\\test_batch'

2. Load the data files into dicts

def unpickle(file):  # reads one of the files shipped with CIFAR-10 into a Python dict
    import pickle
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='iso-8859-1')
    return batch

3. Inspect the training set

dict_train_batch1 = unpickle(file_data_batch_1)  # read the data_batch file into a dict
print(dict_train_batch1.keys())  # the dict holds 4 key-value pairs

dict_keys(['batch_label', 'labels', 'data', 'filenames'])

print(dict_train_batch1)  # each batch is one dict

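The training-set dict has four entries

  • batch_label: which training batch this is
  • labels: the label (an integer 0-9) for each row of data; a 1-D list of 10000 elements
  • data: the image data (values 0-255); a 10000*3072 2-D numpy array in which each row is one 32*32 image, split into 3 segments (red, green, blue channels) of 1024 elements each
  • filenames: the file name (png) for each row of data; also a 1-D list of 10000 elements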
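
The row layout of data can be made concrete with a small example. This is a minimal sketch, assuming the layout described in the list (1024 red values, then 1024 green, then 1024 blue, each plane stored row by row); `row_to_image` is a hypothetical helper name and the row is synthetic.

```python
import numpy as np


def row_to_image(row):
    """Convert one 3072-value row into a 32x32x3 (row, column, channel) RGB image."""
    row = np.asarray(row)
    # The row is [1024 red | 1024 green | 1024 blue], each plane stored row by row.
    planes = row.reshape(3, 32, 32)
    # Move the channel axis last: (channel, row, col) -> (row, col, channel).
    return planes.transpose(1, 2, 0)


# Synthetic row: red plane all 10, green plane all 20, blue plane all 30.
row = np.concatenate([
    np.full(1024, 10, dtype=np.uint8),
    np.full(1024, 20, dtype=np.uint8),
    np.full(1024, 30, dtype=np.uint8),
])
image = row_to_image(row)
print(image.shape)            # (32, 32, 3)
print(image[0, 0].tolist())   # [10, 20, 30]
```

The transpose only moves the channel axis; the channels stay in red, green, blue order.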
print('-----------------------------------------------')

data_train_batch1 = dict_train_batch1.get('data')  # get data from the dict
print(data_train_batch1)
print(data_train_batch1.shape)
print('-----------------------------------------------')

labels = dict_train_batch1.get('labels')  # get labels from the dict
print(labels)
print(len(labels))
print('-----------------------------------------------')

filenames = dict_train_batch1.get('filenames')  # get filenames from the dict
print(filenames)
print(len(filenames))
print('-----------------------------------------------')


Inspect the test set

dict_test_batch = unpickle(file_test_batch)
print(dict_test_batch.keys())

Everything else is the same as for the training set.


Inspect the meta file

dict_meta_batch = unpickle(file_batches_meta)
print(dict_meta_batch)

{'num_cases_per_batch': 10000, 'label_names': ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'], 'num_vis': 3072}
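
The label_names list is what turns the integer labels into readable class names. A minimal sketch, with the list copied from the meta output above; `labels_to_names` is a hypothetical helper name.

```python
# label_names copied from the batches.meta output
label_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']


def labels_to_names(labels, names=label_names):
    """Map integer labels (0-9) to class names by list position."""
    return [names[label] for label in labels]


print(labels_to_names([6, 9, 0]))  # ['frog', 'truck', 'airplane']
```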


Part 2: CIFAR-100

As with CIFAR-10, the data downloaded from the official site is not raw image files but numeric NumPy arrays.

CIFAR-100 has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class.

The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a “fine” label (the class to which it belongs) and a “coarse” label (the superclass to which it belongs).
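
Because every image carries both labels, the class-to-superclass relation can be recovered from the labels themselves. This is a minimal sketch, assuming the per-image fine and coarse labels are available as two parallel lists (the dataset's training dict stores them under 'fine_labels' and 'coarse_labels'); `fine_to_coarse` is a hypothetical helper name and the label values below are synthetic.

```python
def fine_to_coarse(fine_labels, coarse_labels):
    """Build {fine label: coarse label} from two parallel per-image label lists."""
    mapping = {}
    for fine, coarse in zip(fine_labels, coarse_labels):
        # Every image of one class must belong to the same superclass.
        if mapping.setdefault(fine, coarse) != coarse:
            raise ValueError(f"fine label {fine} maps to more than one coarse label")
    return mapping


# Synthetic labels for illustration only; real values come from the dataset files.
fine = [3, 7, 3, 12, 7]
coarse = [1, 0, 1, 4, 0]
print(fine_to_coarse(fine, coarse))  # {3: 1, 7: 0, 12: 4}
```

On the real training set the resulting dict should have 100 keys and 20 distinct values, one per superclass.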

The 100 classes are as follows:

Superclass | Classes
aquatic mammals | beaver, dolphin, otter, seal, whale
fish | aquarium fish, flatfish, ray, shark, trout
flowers | orchids, poppies, roses, sunflowers, tulips
food containers | bottles, bowls, cans, cups, plates
fruit and vegetables | apples, mushrooms, oranges, pears, sweet peppers
household electrical devices | clock, computer keyboard, lamp, telephone, television
household furniture | bed, chair, couch, table, wardrobe
insects | bee, beetle, butterfly, caterpillar, cockroach
large carnivores | bear, leopard, lion, tiger, wolf
large man-made outdoor things | bridge, castle, house, road, skyscraper
large natural outdoor scenes | cloud, forest, mountain, plain, sea
large omnivores and herbivores | camel, cattle, chimpanzee, elephant, kangaroo
medium-sized mammals | fox, porcupine, possum, raccoon, skunk
non-insect invertebrates | crab, lobster, snail, spider, worm
people | baby, boy, girl, man, woman
reptiles | crocodile, dinosaur, lizard, snake, turtle
small mammals | hamster, mouse, rabbit, shrew, squirrel
trees | maple, oak, palm, pine, willow
vehicles 1 | bicycle, bus, motorcycle, pickup truck, train
vehicles 2 | lawn-mower, rocket, streetcar, tank, tractor

1. Set the file paths

file_train = '.\\major_dataset_repo\\cifar100\\train'
file_test = '.\\major_dataset_repo\\cifar100\\test'
file_meta = '.\\major_dataset_repo\\cifar100\\meta'

2. Load the data files into dicts

def unpickle(file):  # reads one of the files shipped with CIFAR-100 into a Python dict
    import pickle
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='iso-8859-1')
    return batch

3. Inspect the training set

dict_train = unpickle(file_train)  # read the train file into a dict
print(dict_train.keys())  # the dict holds 5 key-value pairs

dict_keys(['filenames', 'batch_label', 'fine_labels', 'coarse_labels', 'data'])

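The training-set dict has five entries

  • filenames: the file name of each image
  • batch_label: a string identifying the batch
  • fine_labels: the class of each image (an integer 0-99)
  • coarse_labels: the superclass of each image (an integer 0-19)
  • data: the image data, a 50000*3072 numpy array with one row per 32*32 image, in the same layout as CIFAR-10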
print('-----------------------------------------------')
filenames = dict_train.get('filenames')  # get filenames from the dict
print(filenames)
print('-----------------------------------------------')
batch_label = dict_train.get('batch_label')  # get batch_label from the dict
print(batch_label)
print('-----------------------------------------------')

fine_labels = dict_train.get('fine_labels')  # get fine_labels from the dict
print(fine_labels)
print(len(fine_labels))
print('-----------------------------------------------')

coarse_labels = dict_train.get('coarse_labels')  # get coarse_labels from the dict
print(coarse_labels)
print(len(coarse_labels))
print('-----------------------------------------------')

data = dict_train.get('data')  # get data from the dict
print(data)
print(data.shape)
print('-----------------------------------------------')


Inspect the test set

dict_test = unpickle(file_test)
print(dict_test.keys())

Everything else is the same as for the training set.


Inspect the meta file

dict_meta_batch = unpickle(file_meta)  # file_meta is the CIFAR-100 meta path set in step 1
print(dict_meta_batch)

{
'fine_label_names': ['apple', 'aquarium_fish', 'baby', 'bear', 'beaver', 'bed', 'bee', 'beetle', 'bicycle', 'bottle', 'bowl', 'boy', 'bridge', 'bus', 'butterfly', 'camel', 'can', 'castle', 'caterpillar', 'cattle', 'chair', 'chimpanzee', 'clock', 'cloud', 'cockroach', 'couch', 'crab', 'crocodile', 'cup', 'dinosaur', 'dolphin', 'elephant', 'flatfish', 'forest', 'fox', 'girl', 'hamster', 'house', 'kangaroo', 'keyboard', 'lamp', 'lawn_mower', 'leopard', 'lion', 'lizard', 'lobster', 'man', 'maple_tree', 'motorcycle', 'mountain', 'mouse', 'mushroom', 'oak_tree', 'orange', 'orchid', 'otter', 'palm_tree', 'pear', 'pickup_truck', 'pine_tree', 'plain', 'plate', 'poppy', 'porcupine', 'possum', 'rabbit', 'raccoon', 'ray', 'road', 'rocket', 'rose', 'sea', 'seal', 'shark', 'shrew', 'skunk', 'skyscraper', 'snail', 'snake', 'spider', 'squirrel', 'streetcar', 'sunflower', 'sweet_pepper', 'table', 'tank', 'telephone', 'television', 'tiger', 'tractor', 'train', 'trout', 'tulip', 'turtle', 'wardrobe', 'whale', 'willow_tree', 'wolf', 'woman', 'worm'],
'coarse_label_names': ['aquatic_mammals', 'fish', 'flowers', 'food_containers', 'fruit_and_vegetables', 'household_electrical_devices', 'household_furniture', 'insects', 'large_carnivores', 'large_man-made_outdoor_things', 'large_natural_outdoor_scenes', 'large_omnivores_and_herbivores', 'medium_mammals', 'non-insect_invertebrates', 'people', 'reptiles', 'small_mammals', 'trees', 'vehicles_1', 'vehicles_2']
}


CIFAR-10: converting the arrays to images

Each image's array is saved into the folder named after its label.

  • 1. Turn each row into a three-channel image
  • 2. Save the image into the folder for its label
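
Two details of these steps are easy to get wrong: OpenCV's imwrite treats a three-channel array as BGR while the CIFAR rows store red, green, blue, and the output file name encodes label, batch and index. The following is a minimal NumPy-only sketch of both; `rgb_to_bgr` and `output_path` are hypothetical helper names, not part of OpenCV or of the original script.

```python
import os

import numpy as np


def rgb_to_bgr(image):
    """Reverse the channel axis of an HxWx3 image (RGB <-> BGR)."""
    return np.ascontiguousarray(image[:, :, ::-1])


def output_path(root, label, batch_index, image_index):
    """Build '<root>/<label>/<batch>_<image>.png', the naming used in this post."""
    return os.path.join(root, str(label), f"{batch_index}_{image_index}.png")


pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)   # one pure-red RGB pixel
print(rgb_to_bgr(pixel)[0, 0].tolist())             # [0, 0, 255]
print(output_path("image", 6, 0, 42))
```

Reversing the last axis has the same effect on the channel order as cv2.cvtColor with cv2.COLOR_RGB2BGR; which one to use is a matter of taste.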
import cv2
import numpy as np

1. Set the file paths

file_data_batch_1 = './cifar10/data_batch_1'
file_data_batch_2 = './cifar10/data_batch_2'
file_data_batch_3 = './cifar10/data_batch_3'
file_data_batch_4 = './cifar10/data_batch_4'
file_data_batch_5 = './cifar10/data_batch_5'
file_batches_meta = './cifar10/batches.meta'
file_test_batch = './cifar10/test_batch'

file_data_batch_list = [file_data_batch_1,file_data_batch_2,file_data_batch_3,file_data_batch_4,file_data_batch_5]

2. Load the data files into dicts

def unpickle(file):  # reads one of the files shipped with CIFAR-10 into a Python dict
    import pickle
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='iso-8859-1')
    return batch

3. Inspect the training set

dict_train_batch1 = unpickle("./cifar10/data_batch_1")  # read the data_batch file into a dict
print(dict_train_batch1.keys())  # the dict holds 4 key-value pairs

data_train_batch1 = dict_train_batch1.get('data')  # get data from the dict
#print(data_train_batch1)
print(data_train_batch1.shape)
print('-----------------------------------------------')

labels = dict_train_batch1.get('labels')  # get labels from the dict
#print(labels)
print(len(labels))
print('-----------------------------------------------')

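4. Convert the arrays to three-channel images

import os

for i in range(5):
    dict_train_batch = unpickle(file_data_batch_list[i])
    data_train_batch = dict_train_batch.get('data')  # get data from the dict
    labels = dict_train_batch.get('labels')  # get labels from the dict
    for j in range(10000):
        matrix = np.reshape(data_train_batch[j], (3, 32, 32))
        matrix = matrix.transpose(1, 2, 0)  # (channel, row, col) --> (row, col, channel)
        matrix = cv2.cvtColor(matrix, cv2.COLOR_RGB2BGR)  # cv2.imwrite expects BGR channel order
        label = labels[j]
        os.makedirs("./image/"+str(label), exist_ok=True)  # cv2.imwrite does not create folders
        cv2.imwrite("./image/"+str(label)+"/"+str(i)+"_"+str(j)+".png", matrix)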

Test

import cv2
import numpy as np
matrix = np.reshape(data_train_batch1[0], (3, 32, 32))
matrix = matrix.transpose(1, 2, 0)
matrix = cv2.cvtColor(matrix, cv2.COLOR_RGB2BGR)  # convert RGB to the BGR order OpenCV expects
cv2.imwrite("img_test_show.png", matrix)
img = cv2.imread("img_test_show.png")
cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

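Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Article link: https://blog.csdn.net/qq_41375318/article/details/112133932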
