Newest 'gpu' Questions - Stack Overflow

Questions tagged [gpu]

Acronym for "Graphics Processing Unit". For programming traditional graphical applications, see the tag entry for "graphics programming". For general-purpose programming using GPUs, see the tag entry for "gpgpu". For specific GPU programming technologies, see the popular tag entries for "opencl", "cuda" and "thrust".

0
votes
0 answers
16 views

Can I compile OpenCV with CUDA support on a machine without CUDA Dev Tools installed?

I would like to test the performance of GPU-based computation on a remote Windows 2016 machine. For that, I built OpenCV 2.4.13 on my local development machine with the build option 'WITH_CUDA = true'. But ...
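One way to check a build without the CUDA toolchain on the target: OpenCV's Python bindings can report the compile-time configuration at runtime. A minimal sketch, assuming the same build's cv2 module is importable:

```python
# Minimal sketch: report whether this OpenCV build was compiled with CUDA.
import cv2

build_info = cv2.getBuildInformation()
# The build report contains an "NVIDIA CUDA" section when WITH_CUDA was set.
for line in build_info.splitlines():
    if "CUDA" in line:
        print(line.strip())
```

Note that compiling with WITH_CUDA only bakes CUDA support into the binaries; the target machine still needs a compatible CUDA runtime and driver to execute GPU code, even without the dev tools installed.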
-6
votes
0 answers
35 views

How to get GPU Memory Type for Intel Cards [on hold]

I want to detect the GPU memory type (for example, DDR4) for an Intel GPU, as GPU-Z reports it (see: GPU Intel Memory Type by GPU-Z).
0
votes
0 answers
6 views

Keras 2.2.4 runs slowly with the cntk-gpu 2.5.1 backend

Hi everyone. I have a problem when I use Keras with the cntk-gpu 2.5.1 backend. For some reason, I have to use cntk-gpu 2.5.1 as the Keras backend, and I have a piece of code with the core code as ...
1
vote
1 answer
39 views

Tensorflow slower on GPU than on CPU

Using Keras with the TensorFlow backend, I am trying to train an LSTM network, and it is taking much longer to run on a GPU than on a CPU. I am training the LSTM network using the fit_generator function. ...
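One common cause, though not necessarily the asker's: the generic LSTM layer launches many small GPU kernels per timestep. Keras 2.x with the TensorFlow backend ships a cuDNN-fused variant; a minimal sketch with placeholder sizes:

```python
# Sketch: cuDNN-fused LSTM (GPU only) instead of the generic LSTM layer.
from keras.models import Sequential
from keras.layers import CuDNNLSTM, Dense

model = Sequential()
model.add(CuDNNLSTM(128, input_shape=(100, 32)))  # (timesteps, features)
model.add(Dense(1))
model.compile(optimizer="adam", loss="mse")
```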
-2
votes
2 answers
35 views

How to swap from GPU to CPU only?

Hi, I was wondering how I can run machine learning code on my CPU instead of a GPU. I have tried setting the GPU to false in the settings file, but that hasn't fixed it. ### global settings ...
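A framework-agnostic way to do this is to hide the CUDA devices from the process before the library initialises; a minimal sketch:

```python
# Sketch: force CPU-only execution by hiding all CUDA devices.
# This must run before the ML framework is first imported.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

import tensorflow as tf  # the framework now sees no GPUs
```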
0
votes
0 answers
18 views

OpenGL ES uses a lot of CPU time when I render on a background thread

I developed an app using OpenGL ES 3 on iOS. I have read the OpenGL ES Programming Guide; many techniques introduced there are aimed specifically at creating OpenGL apps ...
0
votes
0 answers
14 views

Why does my application containing a DWM thumbnail lag when the target app is moved to a second monitor?

Scenario: I have a WPF fullscreen borderless application that displays a live thumbnail of a borderless, fullscreen UWP app that plays video. My laptop setup includes one extended monitor through HDMI ...
-1
votes
0 answers
42 views

Python code unable to access GPU CUDA driver

When I try to execute Python test scripts that attempt to invoke the GPU (compute capability 6, with CUDA 8 correctly installed and functional), the CLI flags an error. As far as I can tell, Python (32 bit) ...
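A 32-bit interpreter is a plausible culprit here: CUDA toolkits ship 64-bit runtime libraries, so a 32-bit Python process generally cannot load them. A quick check of the interpreter's bitness:

```python
# Sketch: confirm whether the running interpreter is 32- or 64-bit.
import struct
import platform

print(platform.architecture())          # e.g. ('32bit', 'WindowsPE')
print(struct.calcsize("P") * 8, "bit")  # pointer size in bits
```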
0
votes
0 answers
10 views

tensorflow.python.framework.errors_impl.InternalError: GPU sync failed

I want to train an LSTM network on a GPU (Nvidia Quadro P5000); tensorflow-gpu 1.13 is installed. The network: model = Sequential() model.add(LSTM(256, return_sequences=True, input_shape=(SEQ_LEN, ...
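"GPU sync failed" is often an out-of-memory symptom in disguise. One mitigation that sometimes helps on TF 1.x is letting TensorFlow grow its GPU allocation instead of grabbing all memory up front; a minimal sketch, assuming Keras on the TensorFlow backend:

```python
# Sketch (TF 1.x): enable incremental GPU memory allocation for Keras.
import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))
```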
1
vote
1 answer
17 views

How to implement PyTorch 1D cross-correlation for long signals in the Fourier domain?

I have a series of signals of length n = 36,000 on which I need to perform cross-correlation. Currently, my CPU implementation in NumPy is a little slow. I've heard PyTorch can greatly speed up tensor ...
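For linear cross-correlation via the FFT, the standard trick is to zero-pad to at least 2n - 1 samples so the circular correlation does not wrap around. A minimal sketch, assuming PyTorch >= 1.7 (which provides the torch.fft module):

```python
# Sketch: 1D linear cross-correlation in the Fourier domain.
import torch

def xcorr_fft(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    n = a.shape[-1] + b.shape[-1] - 1   # pad to avoid circular wrap-around
    fa = torch.fft.rfft(a, n=n)
    fb = torch.fft.rfft(b, n=n)
    return torch.fft.irfft(fa.conj() * fb, n=n)

device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(36000, device=device)
b = torch.randn(36000, device=device)
print(xcorr_fft(a, b).shape)  # torch.Size([71999])
```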
-1
votes
0 answers
10 views

Cannot get TensorFlow GPU to install

Trying to install and run the TensorFlow GPU version. With pip install tensorflow-gpu==2.0.0-beta1 I get "Failed to load the native TensorFlow runtime." I have followed the TensorFlow GPU installation, ...
-1
votes
0 answers
8 views

Linux shows no GPU process but Python code shows a CUDA out-of-memory error

I am working on a remote Linux machine through PuTTY. I want to run Python code. The error that the machine learning code in Python gives is: RuntimeError: CUDA out of memory. Tried to allocate ...
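If the framework is PyTorch, it is worth distinguishing memory held by live tensors from memory parked in PyTorch's caching allocator; another user's process (visible in nvidia-smi) can also occupy the card. A small diagnostic sketch:

```python
# Sketch: inspect PyTorch's own GPU memory accounting.
import torch

print(torch.cuda.memory_allocated() / 1e6, "MB allocated by live tensors")
print(torch.cuda.memory_cached() / 1e6, "MB held by the caching allocator")
torch.cuda.empty_cache()  # return cached blocks to the driver
```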
0
votes
0 answers
8 views

Turicreate model training is not using the full GPU

I am trying to train an object detection model using turicreate, but it is not using the full system resources, and I think it is taking more time to train the model. Is there any setting that can make ...
0
votes
0 answers
7 views

How to process WARC files faster?

I'm extracting links from WARC files with Python on a single server, using multiple cores by running multiple instances of the parser script. What I want to know is: is it possible to process/parse ...
0
votes
1 answer
36 views

How to make templated Compute Kernels in Metal

I have been writing some Metal compute kernels. So, I wrote a kernel with the following declaration: kernel void myKernel(const device uint32_t *inData [[buffer(MyKernelIn)]], device uint32_t ...
0
votes
0 answers
11 views

Is there a way to use OpenCV with GPU acceleration on Google Colab?

I am trying to train a Haar classifier on Colab using !opencv_traincascade. Based on the time it takes to train stages 1-5, it would take me about 3 days to finish stage 20. Is there a way to take ...
1
vote
1 answer
24 views

Pycuda 2019.1, how to properly copy a gpuarray?

Pycuda has a long-standing bug in which it doesn't appear to preserve order or strides when copying, e.g.: import numpy as np import pycuda.autoinit from pycuda import gpuarray np_array = np.array([[1,...
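A common workaround while the bug stands is to normalise the host array to C order before the transfer, so the strides on the GPU match what gpuarray assumes; a minimal sketch:

```python
# Sketch: force C-contiguous layout before uploading to the GPU.
import numpy as np
import pycuda.autoinit  # noqa: F401  (initialises a CUDA context)
from pycuda import gpuarray

np_array = np.asfortranarray([[1, 2], [3, 4]])   # deliberately non-C-order
gpu_array = gpuarray.to_gpu(np.ascontiguousarray(np_array))
assert np.array_equal(gpu_array.get(), np_array)
```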
1
vote
1 answer
30 views

Not able to use GPU on Google Cloud Compute Engine

I used Google Cloud Compute Engine and created an instance with 8 vCPUs, 30 GB memory and an Nvidia V100 GPU, using Windows Server 2019 Datacenter (Desktop Experience). I checked the display device box as ...
1
vote
0 answers
25 views

Tensorflow: Using multiple GPUs

I have a really big matrix (A) which I want to slice and then run the matrix multiplication on two different GPUs, because the GPUs only have 16 GB of memory each. The problem is (in my opinion) that ...
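A minimal sketch of the slicing idea in TF 1.x graph mode: pin each half of the product to a different GPU and concatenate the results (sizes here are placeholders):

```python
# Sketch: split a matmul across two GPUs, gather the result on the CPU.
import tensorflow as tf

A = tf.random.uniform((4096, 4096))
B = tf.random.uniform((4096, 4096))
A0, A1 = tf.split(A, 2, axis=0)

with tf.device("/gpu:0"):
    C0 = tf.matmul(A0, B)
with tf.device("/gpu:1"):
    C1 = tf.matmul(A1, B)
with tf.device("/cpu:0"):
    C = tf.concat([C0, C1], axis=0)

with tf.Session() as sess:
    print(sess.run(tf.shape(C)))  # [4096 4096]
```

Note that B is still needed in full on both devices, so this only halves the memory for A and the output.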
2
votes
0 answers
30 views

What's the best way to block on a GPU operation in TensorFlow's Eager mode?

I would like to know the recommended way to wait for a GPU operation to complete in TensorFlow Eager mode. Operations that are located on a GPU device appear to execute asynchronously (I could not ...
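One pragmatic barrier: copying a result to the host cannot complete until the producing GPU kernels have finished, so .numpy() doubles as a synchronisation point. A minimal sketch on TF 1.x Eager:

```python
# Sketch: use the device-to-host copy as a de-facto GPU barrier.
import time
import tensorflow as tf

tf.enable_eager_execution()  # TF 1.x; Eager is the default in TF 2.x

x = tf.random.uniform((2048, 2048))
start = time.time()
y = tf.matmul(x, x)  # may return before the GPU kernel completes
_ = y.numpy()        # blocks until the result actually exists on the host
print("elapsed:", time.time() - start)
```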
0
votes
2 answers
23 views

How to use a remote machine's GPU in jupyter notebook

I am trying to run tensorflow on a remote machine's GPU through Jupyter notebook. However, if I print the available devices using tf, I only get CPUs. I have never used a GPU before and am relatively ...
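A quick way to see what the remote kernel can actually reach, run inside the notebook itself:

```python
# Sketch: list every device visible to this TensorFlow process.
from tensorflow.python.client import device_lib

for d in device_lib.list_local_devices():
    print(d.device_type, d.name)
```

If only CPU devices appear, the notebook kernel is not running against a GPU-enabled TensorFlow install, or no GPU is visible to that process.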
0
votes
0 answers
24 views

Scaling deployments that use GPUs based on demand

I am currently deploying GPU instances and scaling them with duty cycle, but that is not a very good metric. We have a deployment that uses GPUs, and it exposes a REST API where other jobs / pods ...
0
votes
0 answers
26 views

TensorFlow training crashes with "exceeds 10% of system memory" although training with a batch size of 1

Training with a batch size of 128/64/32 used to simply empty out the GPU memory after several epochs. However, running stochastic batch training actually makes the program get stuck at 0% of the 1st ...
1
vote
0 answers
34 views

tf.test.is_gpu_available() returns False on GCP

I am training a CNN on GCP's notebook using a Tesla V100. I've trained a simple YOLO on my own custom data and it was pretty fast but not very accurate. So, I decided to write my own code from scratch ...
0
votes
0 answers
5 views

How to install TeraChem on GPU?

I have been assigned to install TeraChem on a GPU. Besides reading the user manual, can anyone suggest a platform that shares hands-on guidance about the installation, or share your experience or ...
0
votes
0 answers
26 views

Keras fit_generator not using all cores on Colab GPU

I am running a Keras model with the TensorFlow backend. It works on a desktop, although all 16 cores of my Windows machine are not used. I ported it to Google's Colab and ran it there. I am using Keras' ...
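fit_generator runs the Python-side data pipeline in a single worker by default, regardless of how many cores the machine has. A self-contained sketch of the relevant arguments (the model and generator are toy placeholders):

```python
# Sketch: parallelise Keras' generator-based input pipeline.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

def batch_generator(batch_size=32):
    while True:
        x = np.random.rand(batch_size, 16)
        yield x, x.sum(axis=1, keepdims=True)

model = Sequential([Dense(8, input_shape=(16,)), Dense(1)])
model.compile(optimizer="adam", loss="mse")

model.fit_generator(
    batch_generator(),
    steps_per_epoch=100,
    epochs=2,
    workers=4,                 # parallel data-loading workers
    use_multiprocessing=True,  # processes instead of threads
    max_queue_size=10,
)
```

Note this parallelises data loading only; the GPU compute itself is unaffected.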
0
votes
0 answers
27 views

JupyterHub: Running notebook workload on multiple GPUs

Is there a way to connect a JupyterHub-spawned notebook that runs within a Kubernetes cluster to multiple GPUs that are located: a) on the same worker node, b) on other nodes? Thank you for your ...
0
votes
0 answers
15 views

Restrict resources for tensorflow on a single GPU

I run some tensorflow.keras code (with the official TensorFlow Docker image) on my local machine with an RTX 2080 GPU. Since the memory and GPU utilization are nearly maxed out, the response of the machine ...
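On TF 1.x one knob for this is capping the fraction of GPU memory the process may claim; a minimal sketch (the 0.5 is a placeholder):

```python
# Sketch (TF 1.x): limit TensorFlow to half of the GPU's memory.
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.5
tf.keras.backend.set_session(tf.Session(config=config))
```

This caps memory, not compute; GPU utilisation can still reach 100% while training.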
0
votes
0 answers
19 views

How to run a Java application with TensorFlow GPU on Windows 10?

I have been trying to run a TensorFlow GPU Java application on Windows 10. I have not been able to run the application on the GPU; each time it runs on the CPU. I followed the instructions from https://www....
1
vote
0 answers
93 views

Global memory access coalescing in CUDA - Maxwell architecture

I have code for matrix multiplication running on my GeForce 940M (Maxwell architecture) with CUDA compute capability 5.0. I have used the NVIDIA Visual Profiler to measure the number of global load ...
1
vote
1 answer
27 views

TensorFlow on multiple machines with multiple GPUs?

I'm new to machine learning and TensorFlow. I have a question about distributed training in TensorFlow. I've read about multi-GPU environments, and it looks like it is quite possible (https://www....
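In recent TensorFlow versions, the distribution strategies cover this case: the same strategy object replicates a Keras model across the GPUs of every worker listed in the TF_CONFIG environment variable. A minimal sketch:

```python
# Sketch: multi-machine, multi-GPU replication via a distribution strategy.
import tensorflow as tf

strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()

with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(10,))])
    model.compile(optimizer="sgd", loss="mse")
# model.fit(...) then trains synchronously across all configured workers.
```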
1
vote
0 answers
14 views

Does tf.contrib.eager.list_devices() work on AI Platform machines?

I am trying to set up a Tensorflow training job on AI Platform and want to be able to dynamically configure how many GPUs to use. Currently to get the number of GPUs, I am using: ...
0
votes
1 answer
54 views

Visual Basic: find GPU dedicated memory (NVIDIA or AMD)

I am trying to find a user's dedicated RAM on their GPU. I found code online that is great for finding different GPU properties using WMI, but none of the properties returns anything like "4 GB" for ...
0
votes
1 answer
26 views

QueryPerformanceFrequency and QueryPerformanceCounter with Quick Sort GPU programming in OpenCL

I'm trying to execute the Quick Sort algorithm on a GPU using OpenCL. I found a package developed by Intel titled "GPU-Quicksort in OpenCL 2.0: Nested Parallelism and Work-Group Scan Functions". However ...
0
votes
1 answer
66 views

Understanding openAI gym and Optuna hyperparameter tuning using GPU multiprocessing

I am training a reinforcement learning agent using OpenAI's stable-baselines. I'm also optimising the agent's hyperparameters using Optuna. To speed up the process, I am using multiprocessing in ...
-1
votes
0 answers
30 views

cudaMemGetInfo from multiple processes behaves inconsistently on Windows 10

When running two (or more) programs utilizing CUDA (v10.1) at the same time, I am observing significant discrepancies in the behavior of cudaMemGetInfo. I have two RTX 2080 graphics cards (each with ...
0
votes
1 answer
14 views

multi_gpu_model : object of type 'NoneType' has no len()

I am getting this error while using Keras' multi_gpu_model. The code runs fine if I eliminate this line. Also, with a CNN model it works fine; it's just that with a dense network it gives the error. ...
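One common cause of that NoneType error, which may or may not be the asker's: multi_gpu_model has to slice the input batch, so the model's input shape must be fully defined. A minimal sketch with an explicit input_shape on the dense network:

```python
# Sketch: give the Sequential model a defined input shape before wrapping.
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import multi_gpu_model

model = Sequential([
    Dense(64, activation="relu", input_shape=(100,)),  # explicit shape
    Dense(1),
])
parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(optimizer="adam", loss="mse")
```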
0
votes
1 answer
35 views

GPU memory usage of cuDNN LSTM and acceleration

I have a couple of questions about a stateful cuDNN LSTM model I'm trying to fit in R using the keras library. I have tensorflow-gpu installed and it seems to be running successfully. The first thing I'm ...
-3
votes
2 answers
60 views

What does a program (the assembly) that uses a GPU even look like?

From this answer, it seems that GPU manufacturers just provide a driver for particular GPU APIs, and that there's no such thing as GPU assembly or at the very least, there will never be a GPU assembly ...
0
votes
1 answer
21 views

How to set proper bitrate for ffmpeg with hevc_nvenc?

When I transcode video to H.265 with the following command, I get a bitrate of about 600K and the quality is almost the same as the original: ffmpeg -i data2.mp4 -c:v libx265 -c:a copy d2.mp4 However, when I ...
0
votes
1 answer
45 views

Using Vulkan unique handle in struct leads to “implicitly deleted” error

I have this in my code: struct Buffer { vk::UniqueBuffer buffer; vk::UniqueDeviceMemory memory; unsigned int top{0}; }; struct Image ...
0
votes
1 answer
33 views

How can I read the DirectX GPU values in regedit?

Regedit DirectX information: I have found this registry key and I need to know how I can convert these items into real values: DriverVersion LastSeen MaxD3D11FeatureLevel MaxD3D12FeatureLevel ...
-1
votes
0 answers
34 views

Multi GPU model on keras slower than single GPU model

I'm training a GAN model on a 4x Titan Xp GPU workstation. I noticed that the GPU utilization was around 90% when I trained on a single GPU, and less than 20% per GPU when I ran it ...
0
votes
0 answers
23 views

How to handle inter-queue-family generalization in Vulkan for Compute work

I am learning Vulkan and am interested in building a good multi-threaded compute framework. My goal is to increase GPU occupancy by using a compute queue per thread. However, I am working on a 32 ...
1
vote
0 answers
29 views

Initialisation of MTLArgumentEncoder

So, I am making a Metal compute application. According to good practices on Metal, MTLComputePipelineState should be initialised at the beginning of the program, because it is expensive to create. To make a ...
0
votes
1 answer
41 views

Atomic operation between integrated GPU and CPU

Hi, I'm working on developing an application that involves working on shared data between the GPU and CPU. I know I can do atomic operations on the GPU and CPU separately. I also don't want to use event ...
2
votes
0 answers
31 views

How to resolve the problems when using the KinfuLS package in PCL?

Hi everybody! I am trying out PCL, especially its GPU implementation of the marching cubes algorithm. I use PCL 1.9.1 under Ubuntu 18.04. However, I encounter some problems, listed below, when I try to use ...
0
votes
0 answers
16 views

Can't run tensorflow-node-gpu

I can't run @tensorflow/tfjs-node-gpu. Did I miss any important step? I have CUDA and Python, and I can install the module; it just won't run. My current code is just a simple require: const tf = ...
-1
votes
0 answers
17 views

TensorFlow GPU uses only the CPU instead of the GPU on Windows 10

I have a problem running TensorFlow GPU computation on Windows 10. I am using an Nvidia RTX 2060 with properly installed drivers (419.67), CUDA 10, cuDNN 7.5 and tensorflow-gpu. When I run: a = tf....
0
votes
1 answer
19 views

One GPU uses more memory than others during training

I use multiple GPUs to train a model with PyTorch. One GPU uses more memory than the others, causing "out of memory". Why would one GPU use more memory? Is it possible to make the usage more balanced? Is there ...
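With nn.DataParallel, every replica's output is gathered back onto GPU 0, which is one common reason the first card fills up. One mitigation is to compute the loss inside the replicated module so only scalars are gathered; a minimal sketch:

```python
# Sketch: return per-replica losses instead of full outputs from DataParallel.
import torch
import torch.nn as nn

class WithLoss(nn.Module):
    def __init__(self, model, criterion):
        super().__init__()
        self.model, self.criterion = model, criterion

    def forward(self, x, y):
        return self.criterion(self.model(x), y)  # scalar per replica

net = nn.DataParallel(WithLoss(nn.Linear(512, 512), nn.MSELoss()).cuda())
x = torch.randn(64, 512).cuda()
y = torch.randn(64, 512).cuda()
loss = net(x, y).mean()  # average the gathered per-replica losses
loss.backward()
```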