Hi Kuba,
On Thursday 06 Oct 2005 23:23, Kuba Ober wrote:
we can probably do better than inb() / outb().
You can't do any better than that. [It] is the only one that makes sense (when you run things on ia32).
... and when you're not on an ia32 platform with a superIO chip?
Advantages of using ppdev over simple inb() / outb() are:
[*] cross-architecture support (arm, alpha, powerpc, ...)
That'd be good for winelib only or wine-with-emulator (bochs? qemu?).
Yup, both. A ported application (via winelib or qemu) should work under any Linux architecture. Unfortunately, it would be a Linux-specific solution; the *BSDs have their own interface.
[*] support for some esoteric devices (USB-parallel converters, ...)
At a huge performance penalty ;)
But it would work; that's my point. The performance of parallel-over-USB is a separate issue.
Legacy devices (such as parallel ports) are gradually being phased out, so writing code that requires a SuperIO chip is not the best approach.
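To be concrete, here's roughly what the ppdev route looks like from user space (a minimal sketch, assuming the ppdev module is loaded and the port shows up as /dev/parport0; most error handling trimmed for brevity):

/* Minimal sketch of driving the port through ppdev rather than raw
 * inb()/outb().  Assumes the ppdev module is loaded and the port is
 * visible as /dev/parport0. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/ppdev.h>
#include <linux/parport.h>

int main(void)
{
    unsigned char out = 0x55, status;
    int fd = open("/dev/parport0", O_RDWR);

    if (fd < 0) { perror("open /dev/parport0"); return 1; }
    if (ioctl(fd, PPCLAIM) < 0) { perror("PPCLAIM"); return 1; }

    ioctl(fd, PPWDATA, &out);        /* the outb() on the data register   */
    ioctl(fd, PPRSTATUS, &status);   /* the inb() on the status register  */
    printf("status register: 0x%02x\n", status);

    ioctl(fd, PPRELEASE);
    close(fd);
    return 0;
}

The same program runs unchanged on arm, alpha, powerpc and friends, which is rather the point.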
The overhead in doing a syscall isn't significant, as any outb() operation takes ~1us anyway.
AFAIK, the overhead stems from the fact that instead of a machine instruction you have to:
- process an exception in the kernel, which then signals SIGSEGV to the
process
- invoke the signal handler
- determine what's up and disassemble the instruction at CS:EIP
- invoke a function/syscall based on the disassembled instruction
If this isn't dog slow, I don't know what is. I wasn't entirely clear: the syscall is the least of our worries, in fact :)
I think you may be confusing this with some other activity (maybe handling an invalid memory access?). A syscall is pretty simple. The application does some bookkeeping and calls int(errupt) 0x80, triggering the switch from user-land to kernel-land. The kernel then picks up the request and carries on. It's described here[1], although the details may have changed slightly with more recent kernels. There's no signalling (in the Unix user-land sense) going on.
[1] http://www.tldp.org/LDP/khg/HyperNews/get/syscall/syscall86.html
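To make that concrete, the "bookkeeping" on ia32 amounts to little more than this (a sketch only; 20 is __NR_getpid on ia32, and other architectures use a different trap instruction):

/* What a syscall boils down to on ia32: put the syscall number (and any
 * arguments) in registers, then trap into the kernel with int 0x80.
 * No SIGSEGV or signal handler is involved at any point.
 * ia32-only sketch; 20 is __NR_getpid on that architecture. */
#include <stdio.h>

int main(void)
{
    long pid;

    __asm__ volatile ("int $0x80"
                      : "=a" (pid)   /* result comes back in eax   */
                      : "a" (20));   /* syscall number goes in eax */
    printf("getpid() via int 0x80: %ld\n", pid);
    return 0;
}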
Overhead is "currently" (measured for 2.4.0) slightly under 0.4us (see [2]). For 2.6-series kernels it may have gone down slightly further, but 0.4us seems a reasonable upper bound. Assuming the kernel driver is reasonably written, I'd make a complete guess that the total overhead is between 0.4 and 0.6us (although I should benchmark that number :^).
[2] http://cs.nmu.edu/~benchmark/index.php?page=null_call
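It's easy enough to measure on a given box rather than guess; something along these lines (timing getppid(), which glibc doesn't cache the way it does getpid()) gives the per-syscall cost directly:

/* Quick-and-dirty measurement of null-syscall cost: time a tight loop
 * of getppid() calls.  Loop overhead is in the noise at this scale. */
#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>

int main(void)
{
    const long iterations = 1000000;
    struct timeval start, end;
    long i;
    double elapsed_us;

    gettimeofday(&start, NULL);
    for (i = 0; i < iterations; i++)
        getppid();
    gettimeofday(&end, NULL);

    elapsed_us = (end.tv_sec - start.tv_sec) * 1e6
               + (end.tv_usec - start.tv_usec);
    printf("%.3f us per syscall\n", elapsed_us / iterations);
    return 0;
}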
I suspect most programs designed to work under Win98 just hit the hardware directly, so obtaining permissions (doing ioperm() as root, for example) should work. If we have some mechanism for catching the program doing either inb() or outb(), then we could provide a better implementation via the ppdev interface.
At the cost of slowing things down. For devices that bit bang data (like programmers), this makes things unacceptably slow.
I can't say I share that experience (about being unacceptably slow, that is). A 40-60% increase in overhead for a single instruction would certainly be noticeable, but only if that instruction is the bottleneck in the program. Other activity takes longer (cf. context-switching in [2], for example); even just calling a function takes on the order of 100ns (on my ~700MHz laptop). The time between successive changes of parallel port state may well be (much) larger than the 400-600ns overhead of going through the kernel, in which case the overhead becomes less significant. Of course, this is application-specific.
The worst case would be something driving the parallel port as a square-wave generator: per-operation time would go up by the full 40-60% (assuming all the above numbers). Perhaps slightly more realistically, the PLIP interface is reckoned[3] to have a bandwidth of 1.2Mbit/s, corresponding to a ~3.33us turn-around time per transfer. Adding a 0.4-0.6us overhead would reduce the bandwidth to between roughly 1.1Mbit/s and 1.0Mbit/s (a 10-15% performance drop). Would this matter? No, because if it did you'd go out and buy 100baseT cards and achieve far greater performance (or Myrinet, or ...).
[3] http://yara.ecn.purdue.edu/~pplinux/ppcluster.html
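For what it's worth, the arithmetic behind those figures (the 4 bits per PLIP transfer is my assumption about nibble mode; the rest follows from the quoted 1.2Mbit/s):

/* The arithmetic behind the PLIP estimate: 1.2 Mbit/s at (assumed)
 * 4 bits per transfer gives ~3.33us per transfer; add the 0.4-0.6us
 * syscall overhead and see what bandwidth is left. */
#include <stdio.h>

int main(void)
{
    double bits_per_transfer = 4.0;            /* nibble-mode assumption */
    double base_us = bits_per_transfer / 1.2;  /* ~3.33 us per transfer  */
    double overhead_us[] = { 0.4, 0.6 };       /* added per-syscall cost */
    int i;

    for (i = 0; i < 2; i++)
        printf("+%.1f us -> %.2f Mbit/s\n",
               overhead_us[i],
               bits_per_transfer / (base_us + overhead_us[i]));
    return 0;
}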
For the particular use-case you have in mind, my understanding is that programmers often require some additional delay mechanism so the EPROM can keep up (certainly for writes, probably for reads too). This would reduce the impact of the performance hit, perhaps to an acceptable (or even imperceptible) degree.
Does all this matter? Probably not. I would bet you this Smartie here that if a program is worrying about the nanosecond response of some function, then that function is already good enough, and that some "higher-level" algorithmic optimisation would have a much larger benefit (e.g. ethernet vs PLIP).
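Coming back to the ioperm() route mentioned above, for completeness: this is roughly what "just hitting the hardware" looks like from user space (a sketch; 0x378 is only the usual LPT1 base, ioperm() needs root, and it's ia32/x86-specific):

/* The "just hit the hardware" route, for comparison: grant the process
 * access to the three LPT1 registers with ioperm() and poke them with
 * the glibc inb()/outb() wrappers from sys/io.h.  0x378 is the usual
 * LPT1 base address but that is an assumption about the machine.
 * Compile with -O so the inb()/outb() inlines are actually used. */
#include <stdio.h>
#include <sys/io.h>

#define LPT1_BASE 0x378

int main(void)
{
    unsigned char status;

    /* data, status and control registers live at base, base+1, base+2 */
    if (ioperm(LPT1_BASE, 3, 1) < 0) {
        perror("ioperm (are you root?)");
        return 1;
    }

    outb(0x55, LPT1_BASE);            /* write the data register    */
    status = inb(LPT1_BASE + 1);      /* read the status register   */
    printf("status register: 0x%02x\n", status);

    ioperm(LPT1_BASE, 3, 0);          /* drop the permissions again */
    return 0;
}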
Cheers,
Paul.
(apologies for the overly long email!)