Re: [MirageOS-devel] Mirage-Block-Unix Error



Hi,

On Sat, Mar 5, 2016 at 2:53 PM, Rupert Horlick <rh572@xxxxxxxxx> wrote:
Okay, so I ran it with strace and the offending line is:

write(4, "\377\377\377\377\377\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = -1 EINVAL (Invalid argument)

I then looked up EINVAL errors and found the following description of cases when EINVAL is returned:

fd is attached to an object which is unsuitable for writing; or the file was opened with the O_DIRECT flag, and either the address specified in buf, the value specified in count, or the current file offset is not suitably aligned.

Then I looked at where the file was opened and found that the O_DIRECT flag is used:

open("disk12.img", O_RDWR|O_DIRECT) = 4

So I looked up O_DIRECT and found this:

The O_DIRECT flag on its own makes an effort to transfer data synchronously, but does not give the guarantees of the O_SYNC flag that data and necessary metadata are transferred. To guarantee synchronous I/O, O_SYNC must be used in addition to O_DIRECT.

So I modified mirage-block-unix to add O_SYNC in odirect_stubs.c, because I thought that might help, but no such luck.

We default to O_DIRECT to act more like a raw block device, bypassing any cache in the host OS. It makes the Unix case more like the Xen case, and if we observe poor performance (perhaps through lack of caching) on Unix, then we know it'll probably be poor on Xen too, which should prompt us to think about adding a cache ourselves :-)
 

Luckily I then noticed that prepending "buffered:" to the filename forces buffered IO, and that actually fixed my problem.

Aha, interesting -- if it works in buffered mode then I bet the buffers aren't sector-aligned. Try allocating your buffers with something like

`let page = Io_page.(to_cstruct (get 1))`

Also make sure to supply whole numbers of sectors. A buffer of length 1 will not work; instead you would have to read a whole sector, modify the byte in memory, and then write the whole sector back.
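To make the read-modify-write pattern concrete at the syscall level, here is a minimal C sketch. It is not code from mirage-block-unix. It assumes a 512-byte sector (matching the length in the error message quoted in this thread); some devices may need a larger alignment. `SECTOR_SIZE` and `patch_byte` are hypothetical names chosen for illustration, and `fd` is expected to be a descriptor opened the way the strace output shows.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed sector size; some devices may require a larger alignment. */
#define SECTOR_SIZE 512

/*
 * Change the single byte at file position `pos` without ever issuing a
 * sub-sector transfer: read the enclosing sector into an aligned buffer,
 * patch the byte in memory, then write the whole sector back.
 * Returns 0 on success and -1 on failure.
 */
static int patch_byte(int fd, off_t pos, uint8_t value)
{
    assert(pos >= 0);

    /* posix_memalign returns memory whose address is a multiple of
     * SECTOR_SIZE, which is what O_DIRECT transfers need. */
    void *buf = NULL;
    if (posix_memalign(&buf, SECTOR_SIZE, SECTOR_SIZE) != 0)
        return -1;

    /* Round the position down to the start of its sector. */
    off_t sector_start = pos - (pos % SECTOR_SIZE);
    int rc = -1;

    if (pread(fd, buf, SECTOR_SIZE, sector_start) == SECTOR_SIZE) {
        ((uint8_t *)buf)[pos % SECTOR_SIZE] = value;
        if (pwrite(fd, buf, SECTOR_SIZE, sector_start) == SECTOR_SIZE)
            rc = 0;
    }

    free(buf);
    return rc;
}
```

The buffer address, the transfer length and the file offset are all multiples of the sector size here, which are the three things the EINVAL description lists.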
 
I'm not sure what the exact reason for the failure is, but I thought the insight that I gained was interesting and might lead to someone fixing it. For now I will just use the workaround though!

There are certainly things we could improve in this area:

- the error message is poor and doesn't really help identify the problem. It would be quite straightforward to check the required alignment and length constraints, emit a helpful log message and then abort the program.

- we don't do a good job of reflecting the alignment or length requirements of buffers in signatures. We've discussed copying versus zero-copy interfaces a few times in the past, though: a copying interface would probably limit block performance on very fast devices but would be easier to use.

- the string prefix `buffered:` is a poor interface: I think a better way to do this is to add an optional boolean argument (with a default) to the `connect` function rather than splitting the filename string.
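As a rough illustration of the first point, the alignment and length checks could look something like the C sketch below. This is only an assumption about how such a check might be written, not the library's actual code; `direct_io_problem` and `require_direct_io_ok` are hypothetical names, and the required alignment is passed in by the caller rather than discovered from the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Return NULL if the transfer satisfies the usual unbuffered-I/O
 * constraints, otherwise a message naming the first violated one.
 * `align` is the assumed sector size and must be a power of two.
 */
static const char *direct_io_problem(const void *buf, size_t len,
                                     off_t offset, size_t align)
{
    /* A bad alignment value is a programming error, not an I/O error. */
    assert(align != 0 && (align & (align - 1)) == 0);

    if ((uintptr_t)buf % align != 0)
        return "buffer address is not aligned to the sector size";
    if (len % align != 0)
        return "length is not a whole number of sectors";
    if (offset < 0 || (uintmax_t)offset % align != 0)
        return "file offset is not aligned to the sector size";
    return NULL;
}

/* Log a descriptive message and abort instead of letting write() fail
 * later with a bare EINVAL. */
static void require_direct_io_ok(const void *buf, size_t len,
                                 off_t offset, size_t align)
{
    const char *problem = direct_io_problem(buf, len, offset, align);
    if (problem != NULL) {
        fprintf(stderr,
                "unbuffered I/O rejected: %s (buf=%p len=%zu offset=%lld)\n",
                problem, buf, len, (long long)offset);
        abort();
    }
}
```

Calling the second function just before each unbuffered read or write would turn the opaque "Invalid argument" into a message that says which constraint was violated.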

Cheers,
Dave
 


Rupert


On Sat, Mar 5, 2016 at 11:57 AM Thomas Leonard <talex5@xxxxxxxxx> wrote:
On 5 March 2016 at 08:25, Rupert Horlick <rh572@xxxxxxxxx> wrote:
> Hi all,
>
> I'm running into a very strange issue using mirage-block-unix and I was
> wondering if anyone had some insight.
>
> The core of the issue is that connecting to the same file (disk.img or
> whatever) and writing in two different locations is giving completely
> different results.
>
> If I connect and write to offset 0 in my unix home directory (an NFS
> directory), then everything is completely fine. If I then go to
> /local/scratch (local scratch space on a department machine, so it's on the
> directly attached disk), and connect and write in exactly the same way I get
> the following error:
>
> (Failure "write: Invalid argument in write '' at file disk.img offset 0 with
> length 512")
>
> I've done the same thing in various other locations and never run into this
> error before. I've triple checked permissions and everything should be
> completely fine. Any ideas?

Try running it with "strace". That should show you the actual error
from the kernel.


--
Dr Thomas Leonard        http://roscidus.com/blog/
GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA

_______________________________________________
MirageOS-devel mailing list
MirageOS-devel@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel




--
Dave Scott