Binary vs one hot


Also, should the learning rate be the reciprocal of the number of epochs?

taby said:
Do you do networks in C++?

LOL, I thought you were asking an off-topic question about general networking, not neural networks. That's why I ignored you.

Anyway, I have only a passing knowledge of machine learning.

🙂🙂🙂🙂🙂 ← The tone posse, ready for action.


That’s OK. I know sockets already. 🙂

taby said:

Also, should the learning rate be the reciprocal of the number of epochs?

No, I don't think those two things are closely connected. I think it has more to do with what optimizer you use and the size of the network. You will need to experiment a bit to see what works in your situation, but I can tell you what I have learned over the years.

In order to set the initial learning rate, I usually try to start training with a variety of learning rates, only for a few seconds or minutes each. Very large learning rates will result in explosions where weights go to infinity, or you get NaNs. Find the approximate largest learning rate where things don't seem to explode immediately and then divide it by 3 or so for safety. I would then plan to start training with that initial learning rate, so you can make a lot of progress quickly.
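
A minimal sketch of that probing loop, assuming a hypothetical make_model() factory and a single prepared training batch (x, y); the candidate rates, step count, and threshold logic are illustrative, not a definitive recipe:

import math
import torch

def probe_learning_rates(make_model, loss_fn, x, y, candidate_rates, steps=50):
	# Train briefly at each candidate rate and report whether it explodes.
	usable = []
	for lr in candidate_rates:
		model = make_model()  # fresh weights for every trial
		opt = torch.optim.Adam(model.parameters(), lr=lr)
		exploded = False
		for _ in range(steps):
			loss = loss_fn(model(x), y)
			if not math.isfinite(loss.item()):
				exploded = True  # NaN or inf: this rate is too hot
				break
			opt.zero_grad()
			loss.backward()
			opt.step()
		print(lr, "exploded" if exploded else "ok")
		if not exploded:
			usable.append(lr)
	# Largest surviving rate, divided by ~3 for safety, as described above.
	return max(usable) / 3.0 if usable else None

# e.g. probe_learning_rates(lambda: Net(3, 2), torch.nn.MSELoss(), x, y,
#                           [1.0, 0.3, 0.1, 0.03, 0.01, 0.003, 0.001])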

In the kind of network you are using, I believe people tend to use an adaptive optimizer like Adam. A popular learning-rate schedule is then to drop the rate suddenly two or three times during training. For example, 0.005 for a while (if that's what came out of the previous procedure), then 0.001 for a while, then 0.0001 for a bit at the end, so the final weights have less noise.
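
That step-down schedule maps directly onto PyTorch's built-in torch.optim.lr_scheduler.MultiStepLR; a small sketch, where the milestone epochs and the train_one_epoch() function are placeholder assumptions:

import torch

optimizer = torch.optim.Adam(net.parameters(), lr = 0.005)  # initial rate from the probing step

# Multiply the rate by 0.2 at epochs 60 and 90: 0.005 -> 0.001 -> 0.0002,
# roughly the 0.005 / 0.001 / 0.0001 pattern described above.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones = [60, 90], gamma = 0.2)

for epoch in range(100):
	train_one_epoch(net, optimizer)  # hypothetical per-epoch training function
	scheduler.step()                 # advance the schedule once per epoch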

There is another style of training where you make the network quite large and then train until it completely memorizes all the labels. Traditional statistics would say that such overfitting should lead to bad performance on out-of-sample data. However, if you use a very simple SGD optimizer and keep the learning rate fairly high for the whole run, this can actually perform well out of sample. There are theoretical papers explaining why, and I can try to find one if you are interested. I don't have any personal experience with this type of training.
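
For contrast, that second style just swaps in plain SGD at a constant, fairly high rate for the whole run; a one-line sketch, assuming the same net as above (the 0.05 is illustrative, not a tuned value):

# Plain SGD, no momentum, constant learning rate for the entire training run.
optimizer = torch.optim.SGD(net.parameters(), lr = 0.05)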

Thank you for your expertise.

I was thinking of training x networks, all with different random seeds but the same learning rate. Then the very best of those x is used in the next round. It’s worth a try.


So should it be a race to lower the loss?

Better yet, in the case of a tie on loss, pick the network with the largest derivative of the loss.

Here is some basic code for selecting networks by fitness, using the loss as the fitness score:

import numpy as np
import cv2
import random
import torch

import os
import time




img_width = 32
num_channels = 3

#num_input_components = img_width*img_width*num_channels
num_output_components = 2

num_epochs = 10
learning_rate = 0.001

max_train_files = 1000

num_recursions = 10
num_child_networks = 5




class Net(torch.nn.Module):

	def __init__(self, num_channels, num_output_components):

		super().__init__()
		self.model = torch.nn.Sequential(
		    #Input = 3 x 32 x 32, Output = 32 x 32 x 32
		    torch.nn.Conv2d(in_channels = num_channels, out_channels = 32, kernel_size = 3, padding = 1), 
		    torch.nn.ReLU(),
		    #Input = 32 x 32 x 32, Output = 32 x 16 x 16
		    torch.nn.MaxPool2d(kernel_size=2),
  
		    #Input = 32 x 16 x 16, Output = 64 x 16 x 16
		    torch.nn.Conv2d(in_channels = 32, out_channels = 64, kernel_size = 3, padding = 1),
		    torch.nn.ReLU(),
		    #Input = 64 x 16 x 16, Output = 64 x 8 x 8
		    torch.nn.MaxPool2d(kernel_size=2),
		      
		    #Input = 64 x 8 x 8, Output = 64 x 8 x 8
		    torch.nn.Conv2d(in_channels = 64, out_channels = 64, kernel_size = 3, padding = 1),
		    torch.nn.ReLU(),
		    #Input = 64 x 8 x 8, Output = 64 x 4 x 4
		    torch.nn.MaxPool2d(kernel_size=2),
  
		    torch.nn.Flatten(),
		    torch.nn.Linear(1024, 256),
		    torch.nn.ReLU(),
		    torch.nn.Linear(256, num_output_components)
		)
  
	def forward(self, x):
		return self.model(x)








class float_image:

	def __init__(self, img):
		self.img = img

class image_type:

	def __init__(self, img_type, float_img):
		self.img_type = img_type
		self.float_img = float_img





def do_network(in_net, batch, ground_truth, num_channels, num_output_components, all_train_files, random_seed, num_epochs):

	if in_net is None:
		in_net = Net(num_channels, num_output_components)

	net = in_net

	random.seed(random_seed)  # note: this seeds only the shuffle order; torch's weight init is not seeded here

	optimizer = torch.optim.Adam(net.parameters(), lr = learning_rate)
	loss_func = torch.nn.MSELoss()

	loss = 0

	for epoch in range(num_epochs):
		
		random.shuffle(all_train_files)

		count = 0

		for i in all_train_files:

			batch[count] = i.float_img
		
			if i.img_type == 0: # cat

				ground_truth[count][0] = 1
				ground_truth[count][1] = 0
			
			elif i.img_type == 1: # dog
				
				ground_truth[count][0] = 0
				ground_truth[count][1] = 1

			count = count + 1
	
		x = torch.from_numpy(batch)
		y = torch.from_numpy(ground_truth)

		prediction = net(x)  # the entire training set is one batch: full-batch gradient descent
		loss = loss_func(prediction, y)

		print(epoch, loss)

		optimizer.zero_grad()	 # clear gradients for next train
		loss.backward()		 # backpropagation, compute gradients
		optimizer.step()		# apply gradients
	
	return net, loss






# Loading saved weights is disabled for now; replace "False" with the
# os.path.exists() check to resume from a previous run.
if False: #os.path.exists('weights_' + str(img_width) + '_' + str(num_epochs) + '.pth'):
	net.load_state_dict(torch.load('weights_' + str(img_width) + '_' + str(num_epochs) + '.pth'))
	print("loaded file successfully")
else:
	print("training...")





	all_train_files = []




	file_count = 0

	path = 'training_set/cats/'
	filenames = next(os.walk(path))[2]

	for f in filenames:

		file_count = file_count + 1
		if file_count >= max_train_files:
			break

		print(path + f)
		img = cv2.imread(path + f)
		
		if img is not None:

			img = img.astype(np.float32)
			res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
			flat_file = res / 255.0  # scale pixel values to [0, 1]
			flat_file = np.transpose(flat_file, (2, 0, 1))  # HWC -> CHW, the layout PyTorch expects
			all_train_files.append(image_type(0, flat_file))

		else:
			print("image read failure")





	file_count = 0

	path = 'training_set/dogs/'
	filenames = next(os.walk(path))[2]

	for f in filenames:

		file_count = file_count + 1
		if file_count >= max_train_files:
			break

		print(path + f)
		img = cv2.imread(path + f)
		
		if img is not None:

			img = img.astype(np.float32)
			res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
			flat_file = res / 255.0
			flat_file = np.transpose(flat_file, (2, 0, 1))
			all_train_files.append(image_type(1, flat_file))

		else:
			print("image read failure")




	batch = np.zeros((len(all_train_files), num_channels, img_width, img_width), dtype=np.float32)
	ground_truth = np.zeros((len(all_train_files), num_output_components), dtype=np.float32)	

	# seed from the wall clock (note: calls within the same second reuse the same seed)
	curr_net, curr_loss = do_network(None, batch, ground_truth, num_channels, num_output_components, all_train_files, round(time.time()), num_epochs)

	for y in range(num_recursions):
		for x in range(num_child_networks):

			print(y, x)

			net, loss = do_network(curr_net, batch, ground_truth, num_channels, num_output_components, all_train_files, round(time.time()), num_epochs)

			if loss < curr_loss:

				curr_loss = loss
				curr_net = net







#	torch.save(net.state_dict(), 'weights_' + str(img_width) + '_' + str(num_epochs) + '.pth')



path = 'test_set/cats/'
filenames = next(os.walk(path))[2]

cat_count = 0
total_count = 0

for f in filenames:

	img = cv2.imread(path + f)
			
	if img is not None:

		img = img.astype(np.float32)
		res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
		flat_file = res / 255.0
		flat_file = np.transpose(flat_file, (2, 0, 1))

	else:

		print("image read failure")
		continue

	batch = torch.zeros((1, num_channels, img_width, img_width), dtype=torch.float32)
	batch[0] = torch.from_numpy(flat_file)

	prediction = curr_net(batch)

	if prediction[0][0] > prediction[0][1]:  # class 0 (cat) scored higher
		cat_count = cat_count + 1

	total_count = total_count + 1

print(cat_count / total_count)  # accuracy on the cat test images
print(total_count)





path = 'test_set/dogs/'
filenames = next(os.walk(path))[2]

dog_count = 0
total_count = 0

for f in filenames:

	img = cv2.imread(path + f)
			
	if img is not None:

		img = img.astype(np.float32)
		res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
		flat_file = res / 255.0
		flat_file = np.transpose(flat_file, (2, 0, 1))

	else:

		print("image read failure")
		continue

	batch = torch.zeros((1, num_channels, img_width, img_width), dtype=torch.float32)
	batch[0] = torch.from_numpy(flat_file)

	prediction = curr_net(batch)

	if prediction[0][0] < prediction[0][1]:  # class 1 (dog) scored higher
		dog_count = dog_count + 1

	total_count = total_count + 1

print(dog_count / total_count)  # accuracy on the dog test images
print(total_count)

The same program, with CUDA support enabled:

import numpy as np
import cv2
import random
import torch

import os
import time




img_width = 32
num_channels = 3

#num_input_components = img_width*img_width*num_channels
num_output_components = 2

num_epochs = 10
learning_rate = 0.001

max_train_files = 100000

num_recursions = 10
num_child_networks = 5


class Net(torch.nn.Module):

	def __init__(self, num_channels, num_output_components):

		super().__init__()
		self.model = torch.nn.Sequential(
		    #Input = 3 x 32 x 32, Output = 32 x 32 x 32
		    torch.nn.Conv2d(in_channels = num_channels, out_channels = 32, kernel_size = 3, padding = 1), 
		    torch.nn.ReLU(),
		    #Input = 32 x 32 x 32, Output = 32 x 16 x 16
		    torch.nn.MaxPool2d(kernel_size=2),
  
		    #Input = 32 x 16 x 16, Output = 64 x 16 x 16
		    torch.nn.Conv2d(in_channels = 32, out_channels = 64, kernel_size = 3, padding = 1),
		    torch.nn.ReLU(),
		    #Input = 64 x 16 x 16, Output = 64 x 8 x 8
		    torch.nn.MaxPool2d(kernel_size=2),
		      
		    #Input = 64 x 8 x 8, Output = 64 x 8 x 8
		    torch.nn.Conv2d(in_channels = 64, out_channels = 64, kernel_size = 3, padding = 1),
		    torch.nn.ReLU(),
		    #Input = 64 x 8 x 8, Output = 64 x 4 x 4
		    torch.nn.MaxPool2d(kernel_size=2),
  
		    torch.nn.Flatten(),
		    torch.nn.Linear(1024, 256),
		    torch.nn.ReLU(),
		    torch.nn.Linear(256, num_output_components)
		)
  
	def forward(self, x):
		return self.model(x)








class float_image:

	def __init__(self, img):
		self.img = img

class image_type:

	def __init__(self, img_type, float_img):
		self.img_type = img_type
		self.float_img = float_img





def do_network(in_net, batch, ground_truth, num_channels, num_output_components, all_train_files, random_seed, num_epochs):

	if in_net is None:
		in_net = Net(num_channels, num_output_components)

	net = in_net

	random.seed(random_seed)

	optimizer = torch.optim.Adam(net.parameters(), lr = learning_rate)
	loss_func = torch.nn.MSELoss()

	loss = 0

	net.to(device)  # device is selected below at startup; hard-coding "cuda:0" would crash on CPU-only machines

	for epoch in range(num_epochs):
		
		random.shuffle(all_train_files)

		count = 0

		for i in all_train_files:

			batch[count] = i.float_img
		
			if i.img_type == 0: # cat

				ground_truth[count][0] = 1
				ground_truth[count][1] = 0
			
			elif i.img_type == 1: # dog
				
				ground_truth[count][0] = 0
				ground_truth[count][1] = 1

			count = count + 1
	
		x = torch.from_numpy(batch).to(device)
		y = torch.from_numpy(ground_truth).to(device)


		prediction = net(x)	 
		loss = loss_func(prediction, y)

		print(epoch, loss)

		optimizer.zero_grad()	 # clear gradients for next train
		loss.backward()		 # backpropagation, compute gradients
		optimizer.step()		# apply gradients
	
	return net, loss






# Loading saved weights is disabled for now; replace "False" with the
# os.path.exists() check to resume from a previous run.
if False: #os.path.exists('weights_' + str(img_width) + '_' + str(num_epochs) + '.pth'):
	net.load_state_dict(torch.load('weights_' + str(img_width) + '_' + str(num_epochs) + '.pth'))
	print("loaded file successfully")
else:
	print("training...")


	if torch.cuda.is_available():
		dev = "cuda:0"
	else: 
		dev = "cpu"

	device = torch.device(dev)

	print(device)




	all_train_files = []




	file_count = 0

	path = 'training_set/cats/'
	filenames = next(os.walk(path))[2]

	for f in filenames:

		file_count = file_count + 1
		if file_count >= max_train_files:
			break

		print(path + f)
		img = cv2.imread(path + f)
		
		if img is not None:

			img = img.astype(np.float32)
			res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
			flat_file = res / 255.0
			flat_file = np.transpose(flat_file, (2, 0, 1))
			all_train_files.append(image_type(0, flat_file))

		else:
			print("image read failure")





	file_count = 0

	path = 'training_set/dogs/'
	filenames = next(os.walk(path))[2]

	for f in filenames:

		file_count = file_count + 1
		if file_count >= max_train_files:
			break

		print(path + f)
		img = cv2.imread(path + f)
		
		if img is not None:

			img = img.astype(np.float32)
			res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
			flat_file = res / 255.0
			flat_file = np.transpose(flat_file, (2, 0, 1))
			all_train_files.append(image_type(1, flat_file))

		else:
			print("image read failure")




	batch = np.zeros((len(all_train_files), num_channels, img_width, img_width), dtype=np.float32)
	ground_truth = np.zeros((len(all_train_files), num_output_components), dtype=np.float32)	

	curr_net, curr_loss = do_network(None, batch, ground_truth, num_channels, num_output_components, all_train_files, round(time.time()), num_epochs)

	for y in range(num_recursions):
		for x in range(num_child_networks):

			print(y, x)

			net, loss = do_network(curr_net, batch, ground_truth, num_channels, num_output_components, all_train_files, round(time.time()), num_epochs)

			if loss < curr_loss:

				curr_loss = loss
				curr_net = net







#	torch.save(net.state_dict(), 'weights_' + str(img_width) + '_' + str(num_epochs) + '.pth')



path = 'test_set/cats/'
filenames = next(os.walk(path))[2]

cat_count = 0
total_count = 0

for f in filenames:

	img = cv2.imread(path + f)
			
	if img is not None:

		img = img.astype(np.float32)
		res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
		flat_file = res / 255.0
		flat_file = np.transpose(flat_file, (2, 0, 1))

	else:

		print("image read failure")
		continue

	batch = torch.zeros((1, num_channels, img_width, img_width), dtype=torch.float32)
	batch[0] = torch.from_numpy(flat_file)
	batch = batch.to(device)  # the trained net lives on the GPU, so its input must too

	prediction = curr_net(batch)

	if prediction[0][0] > prediction[0][1]:
		cat_count = cat_count + 1

	total_count = total_count + 1

print(cat_count / total_count)
print(total_count)





path = 'test_set/dogs/'
filenames = next(os.walk(path))[2]

dog_count = 0
total_count = 0

for f in filenames:

	img = cv2.imread(path + f)
			
	if img is not None:

		img = img.astype(np.float32)
		res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
		flat_file = res / 255.0
		flat_file = np.transpose(flat_file, (2, 0, 1))

	else:

		print("image read failure")
		continue

	batch = torch.zeros((1, num_channels, img_width, img_width), dtype=torch.float32)
	batch[0] = torch.from_numpy(flat_file)
	batch = batch.to(device)  # keep the test input on the same device as the net

	prediction = curr_net(batch)

	if prediction[0][0] < prediction[0][1]:
		dog_count = dog_count + 1

	total_count = total_count + 1

print(dog_count / total_count)
print(total_count)

