So I installed Python 2.7.8 and the latest Pygame, but I can't get image.load() to find my image file, despite several attempts to rename the file and recheck the spelling. The image and the script that uses it are in the same directory. Has anyone run into similar issues?
self.src_image = pygame.image.load
is the line in question. image is a parameter that is filled in later with a specific file name.
Here is some context:
import pygame, math, sys, os
from pygame.locals import *
screen = pygame.display.set_mode((1024, 786))
clock = pygame.time.Clock()
class WitchSprite(pygame.sprite.Sprite):
    speed = 10
    acceleration = .4

    def __init__(self, image, position):
        pygame.sprite.Sprite.__init__(self)
        self.src_image = pygame.image.load(os.path.join(image))
        self.position = position
        self.speed = self.direction = 0
        self.k_left = self.k_right = self.k_down = self.k_up = 0
First, check that your image can even be loaded. Use pygame.image.get_extended() to check whether extended image formats can be loaded; it should return True.
The pygame docs say that you should "use os.path.join() for compatibility." Try to load your image using os and see if that takes care of your problem.
ex: "asurf = pygame.image.load(os.path.join('data', 'bla.png'))"
I'm using Python 3.9 with Spyder 5.2.2 (Anaconda) for a U-Net segmentation task with MONAI. After importing all the images into a dictionary, I wrote these lines to define the pre-processing steps:
import SimpleITK as sitk
from monai.inferers import SimpleInferer
from monai.transforms import (
    AsDiscrete,
    DataStatsd,
    AddChanneld,
    Compose,
    Activations,
    LoadImaged,
    Resized,
    RandFlipd,
    ScaleIntensityRanged,
    DataStats,
    AsChannelFirstd,
    AsDiscreted,
    ToTensord,
    EnsureTyped,
    RepeatChanneld,
    EnsureType
)
from monai.transforms import Transform
from monai.data import PILReader  # needed for the reader passed to LoadImaged below
import numpy as np  # used in the custom transform below
monai_load = [
    LoadImaged(keys=["image", "segmentation"], image_only=False, reader=PILReader()),
    EnsureTyped(keys=["image", "segmentation"], data_type="numpy"),
    AddChanneld(keys=["segmentation", "image"]),
    RepeatChanneld(keys=["image"], repeats=3),
    AsChannelFirstd(keys=["image"], channel_dim=0),
]
monai_transforms = [
    AsDiscreted(keys=["segmentation"], threshold=0.5),
    ToTensord(keys=["image", "segmentation"]),
]
class N4ITKTransform(Transform):
    def __call__(self, image):
        filtered = []
        for channel in image["image"]:
            inputImage = sitk.GetImageFromArray(channel)
            inputImage = sitk.Cast(inputImage, sitk.sitkFloat32)
            corrector = sitk.N4BiasFieldCorrectionImageFilter()
            outputImage = corrector.Execute(inputImage)
            filtered.append(sitk.GetArrayFromImage(outputImage))
        image["image"] = np.stack(filtered)
        return image
train_transforms = Compose(monai_load + [N4ITKTransform()] + monai_transforms)
When I compose these transforms and apply them to the training images, Python does not work on the GPU even though torch.cuda.is_available() returns True.
These are the lines where I apply the transforms:
train_ds = IterableDataset(data = train_data, transform = train_transforms)
train_loader = DataLoader(dataset = train_ds, batch_size = batch_size, num_workers = 0, pin_memory = True)
When I define the U-Net model, I send it to 'cuda'.
The problem is in the SimpleITK transform: if I don't use it, Python works on the GPU as usual.
Thank you in advance for getting back to me.
Federico
The answer is simple: SimpleITK uses CPU for processing.
I am not sure whether it is possible to get it to use some of the GPU-accelerated filters from ITK (its base library). If you use ITK Python, you have the possibility to use GPU-filters. But only a few filters have GPU implementations. N4BiasFieldCorrection does NOT have a GPU implementation. So if you want to use this filter, it needs to be done on the CPU.
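If you still want this filter but only want to pay the CPU cost once, a possible workaround (my suggestion, not part of the answer above; it reuses train_data, train_transforms and batch_size from the question) is to cache the output of the deterministic CPU-only transforms with MONAI's CacheDataset, so the N4 correction runs once per image instead of on every pass:
from monai.data import CacheDataset, DataLoader

# Cache the results of the deterministic transforms (including the N4 filter) in memory
train_ds = CacheDataset(data=train_data, transform=train_transforms,
                        cache_rate=1.0, num_workers=4)
train_loader = DataLoader(train_ds, batch_size=batch_size,
                          num_workers=0, pin_memory=True)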
I'm trying to downsample a point cloud. I have 2 data formats for different parts of my data.
The .bin files cause no problems, but when I'm trying to downsample the .e57 files I encounter a strange problem.
Here's what I do:
import numpy as np
import open3d
point_file = "path/to/file.e57"
pcd_data = np.fromfile(point_file, dtype=np.float32)
pcd_data = pcd_data.reshape(-1, 4)
pcd_points = pcd_data[:, :3]
pcd = open3d.geometry.PointCloud()
pcd.points = open3d.utility.Vector3dVector(pcd_points)
pcd_down = pcd.voxel_down_sample(voxel_size=0.8)
res = np.asarray(pcd_down.points)
It works fine for .bin, but when I try the .e57 I get the error:
RuntimeError: [Open3D ERROR] [VoxelDownSample] voxel_size is too small.
No matter if I use voxel_size of 0.005, 0.8, 100, 5000 or 1000000000000000.
I tried the earlier open3d Version:
pcd_down = open3d.geometry.voxel_down_sample(voxel_size=0.8)
and at least it throws no error, but my downsampled point cloud then contains 0 points (out of ~350,000).
Since the file should be structured as points with 4 features each, and the reshape works just fine (for any of my files), the file seems to be read correctly.
Any ideas?
Still have no clue about the original error, but I successfully worked around the problem by using pye57:
https://github.com/davidcaron/pye57
together with this solution to a possibly occurring problem:
https://github.com/davidcaron/pye57/issues/6#issuecomment-803894677
With this code
import numpy as np
import open3d
import pye57
point_file = "path/to/file.e57"
e57 = pye57.E57(point_file)
data = e57.read_scan_raw(0)
assert isinstance(data["cartesianX"], np.ndarray)
assert isinstance(data["cartesianY"], np.ndarray)
assert isinstance(data["cartesianZ"], np.ndarray)
x = np.array(data["cartesianX"])
y = np.array(data["cartesianY"])
z = np.array(data["cartesianZ"])
pcd_points = np.column_stack((x, y, z))  # shape (N, 3): one (x, y, z) row per point
pcd = open3d.geometry.PointCloud()
pcd.points = open3d.utility.Vector3dVector(pcd_points)
pcd_down = pcd.voxel_down_sample(voxel_size=0.0035)
I finally get a downsampled point cloud.
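As a quick sanity check of the result (my addition), you can count the remaining points and write them out with Open3D's I/O helpers:
print(len(pcd_down.points))  # number of points left after downsampling
open3d.io.write_point_cloud("downsampled.ply", pcd_down)  # save for visual inspection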
I want to load a dataset of grayscale images. I used ImageFolder but this doesn't load gray images by default as it converts images to RGB.
I found solutions that load images with ImageFolder and then convert the images to grayscale, using:
transforms.Grayscale(num_output_channels=1)
or
ImageOps.grayscale(image)
Is it correct?
How can I load grayscale images without conversion? I tried ImageDataBunch, but I have problems importing fastai.vision.
Assuming the dataset is stored in the "Dataset" folder as given below, set the root directory as "Dataset":
Dataset
    class_1
        img1.png
        img2.png
    class_2
        img1.png
        img2.png
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
root = 'Dataset/'
data_transform = transforms.Compose([transforms.Grayscale(num_output_channels=1),
transforms.ToTensor()])
dataset = ImageFolder(root, transform=data_transform)
For reference, the train and test datasets are split 70% and 30% respectively.
# Split test and train dataset
train_size = int(0.7 * len(dataset))
test_size = len(dataset) - train_size
train_data, test_data = random_split(dataset, [train_size, test_size])
This dataset can be further divided into train and test data loaders, as shown below, to process the data in batches.
Usually you will see a single batch_size used for both the train and test loaders, but here I define them separately. The batch size does not strictly have to divide the dataset size; with the default DataLoader settings the final batch will simply be smaller than the rest (see the drop_last option shown after the code below if you want to avoid that).
# Set batch size of train data loader
batch_size_train = 20
# Set batch size of test data loader
batch_size_test = 22
# load the split train and test data into batches via DataLoader()
train_loader = DataLoader(train_data, batch_size=batch_size_train, shuffle=True)
test_loader = DataLoader(test_data, batch_size=batch_size_test, shuffle=True)
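If every batch must have the same size, the standard drop_last option of DataLoader discards the final incomplete batch instead; a small illustrative snippet (batch size 32 is just an example):
# Drop the last incomplete batch instead of requiring batch_size to divide len(train_data)
train_loader = DataLoader(train_data, batch_size=32, shuffle=True, drop_last=True)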
Yes, that is correct, and AFAIK Pillow by default loads images in RGB (see e.g. answers to this question), so conversion to grayscale is the only way, though it takes time of course.
Pure pytorch solution (if ImageFolder isn't appropriate)
You can roll out your own data-loading functionality, and if I were you I wouldn't go the fastai route, as it's pretty high level and takes control away from you (you might not need those functionalities anyway).
In principle, all you have to do is to create something like this below:
import pathlib
import torch
from PIL import Image
class ImageDataset(torch.utils.data.Dataset):
    def __init__(self, path: pathlib.Path, images_class: int, regex="*.png"):
        self.files = [file for file in path.glob(regex)]
        self.images_class: int = images_class

    def __len__(self):
        # needed so the datasets can be concatenated with + and used with DataLoader
        return len(self.files)

    def __getitem__(self, index):
        return Image.open(self.files[index]).convert("LA"), self.images_class
# Assuming you have `png` images, can modify that with regex
final_dataset = (
    ImageDataset(pathlib.Path("/path/to/dogs/images"), 0)
    + ImageDataset(pathlib.Path("/path/to/cats/images"), 1)
    + ImageDataset(pathlib.Path("/path/to/turtles/images"), 2)
)
The above would load images from the provided paths, and each item would be returned together with its corresponding class.
This gives you more flexibility (a different folder layout than torchvision.datasets.ImageFolder expects) for a few more lines of code.
Of course, you could add more of those, build them in a loop, or whatever else.
You could also apply torchvision.transforms, e.g. to turn the images above into tensors.
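For instance, a minimal sketch (my own addition, names are hypothetical) that passes a torchvision transform into the dataset above and applies it in __getitem__:
import torchvision.transforms as T

class TransformedImageDataset(ImageDataset):
    def __init__(self, path, images_class, transform, regex="*.png"):
        super().__init__(path, images_class, regex)
        self.transform = transform

    def __getitem__(self, index):
        image, label = super().__getitem__(index)
        return self.transform(image), label  # e.g. PIL image -> tensor

dogs = TransformedImageDataset(pathlib.Path("/path/to/dogs/images"), 0, T.ToTensor())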
torchdata solution
Disclaimer: author here. If you are concerned about the loading times of your data and the grayscale transformation, you could use the torchdata third-party library for PyTorch.
Using it, one could create the same thing as above but use cache or map (to apply torchvision.transforms or other transformations easily) and some other features known e.g. from the tensorflow.data module; see below:
import pathlib
from PIL import Image
import torchdata
# Change inheritance
class ImageDataset(torchdata.Dataset):
    def __init__(self, path: pathlib.Path, images_class: int, regex="*.png"):
        super().__init__()  # And add constructor call and that's it
        self.files = [file for file in path.glob(regex)]
        self.images_class: int = images_class

    def __getitem__(self, index):
        return Image.open(self.files[index]), self.images_class
final_dataset = (
    ImageDataset(pathlib.Path("/path/to/dogs/images"), 0)
    + ImageDataset(pathlib.Path("/path/to/cats/images"), 1)
    + ImageDataset(pathlib.Path("/path/to/turtles/images"), 2)
).cache()  # will cache data in-memory after first pass
# You could apply transformations after caching for possible speed-up
torchvision ImageFolder loader
As correctly pointed out by @jodag in the comments, one can pass a loader callable taking a single path argument to do customized data opening; e.g. for grayscale it could be:
from PIL import Image
import torchvision
dataset = torchvision.datasets.ImageFolder(
"/path/to/images", loader=lambda path: Image.open(path).convert("LA")
)
Please note that you could also use it for other types of files; they don't have to be images.
Make a custom loader and feed it to ImageFolder:
import numpy as np
from PIL import Image, ImageOps
def gray_reader(image_path):
    im = Image.open(image_path)
    im2 = ImageOps.grayscale(im)
    im.close()
    return np.array(im2)  # return np array
    # return im2          # or return the PIL Image instead
some_dataset = ImageFolder(image_root_path, loader=gray_reader)
Edit:
The code below is much better than the previous version: it loads the color image and converts it to grayscale in the transform.
def get_transformer(h, w):
    valid_transform = transforms.Compose([
        transforms.ToPILImage(),
        transforms.Grayscale(num_output_channels=1),
        transforms.Resize((h, w)),
        transforms.ToTensor(),
        transforms.Normalize([0.5], [0.5]),
    ])
    return valid_transform
I am new to ROS. I need to convert a pre-existing video file, or a large number of images that can be concatenated into a video stream, into a .bag file in ROS. I found this code online: http://answers.ros.org/question/11537/creating-a-bag-file-out-of-a-image-sequence/, but it says it is for camera calibration, so I'm not sure it fits my purpose.
Could someone with a good knowledge of ROS confirm that I can use the code in the link provided for my purposes, or if anyone actually has the code I'm looking for, could you please post it here?
The following code converts a video file to a bag file, inspired by the code in the link provided.
A little reminder:
this code depends on cv2 (OpenCV's Python bindings)
the time stamp of each ROS message is computed from the frame index and the fps; the fps falls back to 24 if OpenCV is unable to read it from the video.
import time, sys, os
from ros import rosbag
import roslib, rospy
roslib.load_manifest('sensor_msgs')
from sensor_msgs.msg import Image
from cv_bridge import CvBridge
import cv2
TOPIC = 'camera/image_raw/compressed'
def CreateVideoBag(videopath, bagname):
    '''Creates a bag file from a video file'''
    bag = rosbag.Bag(bagname, 'w')
    cap = cv2.VideoCapture(videopath)
    cb = CvBridge()
    prop_fps = cap.get(cv2.CAP_PROP_FPS)
    if prop_fps != prop_fps or prop_fps <= 1e-2:
        print "Warning: can't get FPS. Assuming 24."
        prop_fps = 24
    ret = True
    frame_id = 0
    while ret:
        ret, frame = cap.read()
        if not ret:
            break
        stamp = rospy.rostime.Time.from_sec(float(frame_id) / prop_fps)
        frame_id += 1
        image = cb.cv2_to_compressed_imgmsg(frame)
        image.header.stamp = stamp
        image.header.frame_id = "camera"
        bag.write(TOPIC, image, stamp)
    cap.release()
    bag.close()

if __name__ == "__main__":
    if len(sys.argv) == 3:
        CreateVideoBag(*sys.argv[1:])
    else:
        print("Usage: video2bag videofilename bagfilename")
I am building an application to continuously display an image fetched from an IP camera. I have figured out how to fetch the image, and how to also display the image using Tkinter. But I cannot get it to continuously refresh the image. Using Python 2.7+.
Here is the code I have so far.
import urllib2, base64
from PIL import Image,ImageTk
import StringIO
import Tkinter
URL = 'http://myurl.cgi'
USERNAME = 'myusername'
PASSWORD = 'mypassword'
def fetch_image(url, username, password):
    # this code works fine
    request = urllib2.Request(url)
    base64string = base64.encodestring('%s:%s' % (username, password)).replace('\n', '')
    request.add_header("Authorization", "Basic %s" % base64string)
    result = urllib2.urlopen(request)
    imgresp = result.read()
    img = Image.open(StringIO.StringIO(imgresp))
    return img
root = Tkinter.Tk()
img = fetch_image(URL,USERNAME,PASSWORD)
tkimg = ImageTk.PhotoImage(img)
Tkinter.Label(root,image=tkimg).pack()
root.mainloop()
How should I edit the code so that the fetch_image is called repeatedly and its output updated in the Tkinter window?
Note that I am not using any button-events to trigger the image refresh, rather it should be refreshed automatically, say, every 1 second.
Here is a solution that uses Tkinter's Tk.after method, which schedules future calls to functions. If you replace everything after your fetch_image definition with the snippet below, you'll get the behavior you described:
root = Tkinter.Tk()
label = Tkinter.Label(root)
label.pack()
img = None
tkimg = [None] # This, or something like it, is necessary because if you do not keep a reference to PhotoImage instances, they get garbage collected.
delay = 500 # in milliseconds
def loopCapture():
    print "capturing"
    # img = fetch_image(URL, USERNAME, PASSWORD)
    img = Image.new('1', (100, 100), 0)
    tkimg[0] = ImageTk.PhotoImage(img)
    label.config(image=tkimg[0])
    root.update_idletasks()
    root.after(delay, loopCapture)

loopCapture()
root.mainloop()
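A slightly different way to keep the PhotoImage reference alive (my own variant of the same idea, an untested sketch) is to store it as an attribute on the label and call the real fetch_image directly; wire it in exactly like loopCapture above, calling it once before root.mainloop():
def loop_capture():
    img = fetch_image(URL, USERNAME, PASSWORD)
    label.photo = ImageTk.PhotoImage(img)  # the attribute keeps a reference, so it is not garbage collected
    label.config(image=label.photo)
    root.after(delay, loop_capture)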