[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 09/10] dma-buf-map: Add memcpy and pointer-increment interfaces



Hi

Am 05.11.20 um 11:07 schrieb Linus Walleij:
> Overall I like this, just an inline question:
> 
> On Tue, Oct 20, 2020 at 2:20 PM Thomas Zimmermann <tzimmermann@xxxxxxx> wrote:
> 
>> To do framebuffer updates, one needs memcpy from system memory and a
>> pointer-increment function. Add both interfaces with documentation.
> 
> (...)
>> +/**
>> + * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
>> + * @dst:       The dma-buf mapping structure
>> + * @src:       The source buffer
>> + * @len:       The number of byte in src
>> + *
>> + * Copies data into a dma-buf mapping. The source buffer is in system
>> + * memory. Depending on the buffer's location, the helper picks the correct
>> + * method of accessing the memory.
>> + */
>> +static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, const 
>> void *src, size_t len)
>> +{
>> +       if (dst->is_iomem)
>> +               memcpy_toio(dst->vaddr_iomem, src, len);
>> +       else
>> +               memcpy(dst->vaddr, src, len);
>> +}
> 
> Are these going to be really big memcpy() operations?

Individually, each could be a scanline, so a few KiB. (4 bytes *
horizontal resolution). Updating a full framebuffer can sum up to
several MiB.

> 
> Some platforms have DMA offload engines that can perform memcpy(),They could 
> be
> drivers/dma, include/linux/dmaengine.h
> especially if the CPU doesn't really need to touch the contents
> and flush caches etc.
> An example exist in some MTD drivers that move large quantities of
> data off flash memory like this:
> drivers/mtd/nand/raw/cadence-nand-controller.c
> 
> Notice that DMAengine and DMAbuf does not have much in common,
> the names can be deceiving.
> 
> The value of this varies with the system architecture. It is not just
> a question about performance but also about power and the CPU
> being able to do other stuff in parallel for large transfers. So *when*
> to use this facility to accelerate memcpy() is a delicate question.
> 
> What I'm after here is if these can be really big, do we want
> (in the long run, not now) open up to the idea to slot in
> hardware-accelerated memcpy() here?

We currently use this functionality for the graphical framebuffer
console that most DRM drivers provide. It's non-accelerated and slow,
but this has not been much of a problem so far.

Within DRM, we're more interested in removing console code from drivers
and going for the generic implementation.

Most of the graphics HW allocates framebuffers from video RAM, system
memory or CMA pools and does not really need these memcpys. Only a few
systems with small video RAM require a shadow buffer, which we flush
into VRAM as needed. Those might benefit.

OTOH, off-loading memcpys to hardware sounds reasonable if we can hide
it from the DRM code. I think it all depends on how invasive that change
would be.

Best regards
Thomas

> 
> Yours,
> Linus Walleij
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer

Attachment: OpenPGP_0x680DC11D530B7A23.asc
Description: application/pgp-keys

Attachment: OpenPGP_signature
Description: OpenPGP digital signature


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.