How to use flow_from_directory in Keras with a CSV file

data-mining python keras image-classification cnn
2022-03-07 14:41:06

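flow_from_directory in Keras requires the images to be in separate subdirectories, one per class. However, I have all my images in a single directory, together with a CSV file that lists each image name and its target class.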

How can I use flow_from_directory directly from the CSV files, which are named train.csv and test.csv?

3 Answers

Something like this should do the job.

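When you try something new, there is a good chance it will go wrong. Use this at your own risk, or try it first on a sample in a completely separate directory.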

import pandas as pd
import os
import numpy as np
import shutil

# source is the current directory
# Open dataset file
dataset = pd.read_csv('dataset.csv')
file_names = list(dataset['filenames'].values)
img_labels = list(dataset['labels'].values)

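# np.unique returns a NumPy array (it has no .values attribute);
# cast labels to str so they can be used as folder names
folders_to_be_created = np.unique(dataset['labels'].astype(str))

source = os.getcwd()

for new_path in folders_to_be_created:
    # exist_ok=True makes a separate os.path.exists check unnecessary
    os.makedirs(os.path.join(source, new_path), exist_ok=True)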

# Make sure the directory contains nothing except the images, the CSV and this script.
# Better still, keep only the images there and reference the CSV file from a different directory.

for current_img, current_label in zip(file_names, img_labels):
    # Check these paths: they assume the images sit in the current directory
    src = os.path.join(source, current_img)
    dst = os.path.join(source, str(current_label), current_img)
    shutil.move(src, dst)
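
Because moving files is hard to undo, a copy-based variant of the approach above may be safer to try first. The following is only a minimal sketch using the standard library: the function name copy_images_into_class_folders and the output_dir argument are hypothetical, and the default column names ('filenames', 'labels') are taken from the snippet above and may differ in your CSV.

```python
import csv
import shutil
from pathlib import Path


def copy_images_into_class_folders(csv_path, image_dir, output_dir,
                                   filename_col="filenames", label_col="labels"):
    """Copy each image listed in the CSV into output_dir/<label>/.

    Returns the filenames that are listed in the CSV but not found in image_dir.
    """
    image_dir = Path(image_dir)
    output_dir = Path(output_dir)
    missing = []

    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            src = image_dir / row[filename_col]
            if not src.is_file():
                # Record the problem instead of aborting halfway through
                missing.append(row[filename_col])
                continue
            class_dir = output_dir / str(row[label_col])
            class_dir.mkdir(parents=True, exist_ok=True)
            # copy2 leaves the original image untouched
            shutil.copy2(src, class_dir / src.name)

    return missing
```

The resulting output_dir then has one subdirectory per class, which is the layout flow_from_directory expects, and the original images stay where they were.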

Use flow_from_dataframe

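import pandas as pd
from sklearn.model_selection import train_test_split
# With standalone Keras, import from keras.preprocessing.image instead
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMAGE_SIZE = (128, 128)  # assumed value; set this to your model's input size
batch_size = 32          # assumed value

df = pd.read_csv("filename.csv")
# class_mode='categorical' expects string labels, so map the numeric classes to names
df["category"] = df["category"].replace({0: 'cat', 1: 'dog'})
train_df, validate_df = train_test_split(df, test_size=0.20, random_state=42)
train_df = train_df.reset_index(drop=True)
validate_df = validate_df.reset_index(drop=True)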

train_datagen = ImageDataGenerator(
    rotation_range=15,
    rescale=1./255,
    shear_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True,
    width_shift_range=0.1,
    height_shift_range=0.1
)

train_generator = train_datagen.flow_from_dataframe(
    train_df, 
    "../input/train/train/", 
    x_col='filename',
    y_col='category',
    target_size=IMAGE_SIZE,
    class_mode='categorical',
    batch_size=batch_size
)
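
Before handing the DataFrame to flow_from_dataframe, it can help to check it against the image directory. As far as I know, class_mode='categorical' expects the label column to hold strings, and filenames that do not exist on disk are skipped with a warning by default. The helper below is a hedged sketch of such a check: prepare_dataframe is a hypothetical name, and the default column names are the ones used in the snippet above.

```python
import os

import pandas as pd


def prepare_dataframe(df, image_dir, x_col="filename", y_col="category"):
    """Return a copy of df with string labels, keeping only rows whose image exists."""
    out = df.copy()
    # Cast labels to str so numeric classes such as 0/1 are accepted as class names
    out[y_col] = out[y_col].astype(str)
    # Keep only the rows whose file is actually present in image_dir
    exists = out[x_col].map(
        lambda name: os.path.isfile(os.path.join(image_dir, name))
    ).astype(bool)
    return out[exists].reset_index(drop=True)
```

Comparing len(df) with the length of the returned DataFrame shows how many rows of the CSV point at missing files.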

Try this:

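# train_datagen is an ImageDataGenerator instance;
# training_path must contain one subdirectory per class
training_set = train_datagen.flow_from_directory(training_path,
                                                 target_size=(64, 64),
                                                 batch_size=32,
                                                 class_mode='binary')

# Fetch one batch of images and labels from the generator
imgs, labels = next(training_set)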