Forum: US 99 BBS

Virtual Private Server (VPS)
Hosting provided by Central
Point Networking
cpnllc.com

Who's Online

For some reason, the 
"Nodelist" and  "Recent
Callers" features are 
not working.

Recent Visitors
- Ray Quinn
  Tue Feb 27 09:53:21 2024
  from Visalia, CA via HTTPS
- Ray Quinn
  Sun Mar 3 17:23:06 2024
  from Visalia, CA via HTTPS
- Ray Quinn
  Wed Mar 20 19:01:41 2024
  from Visalia, CA via HTTPS
- Ray Quinn
  Mon Mar 25 19:54:41 2024
  from Visalia, CA via HTTP

System Info

Sysop:	Ray Quinn
Location:	Visalia, CA
Users:	50
Nodes:	10 (0 / 10)
Uptime:	62:36:34
Calls:	2
Files:	11,885
Messages:	148,295

Check out the US 99 
menu above for links 
to information about 
US Highway 99, after 
which the US 99 BBS 
is named.

Be sure to click on 
the Amateur Radio 
menu item above for 
packet BBSes, packet 
software, packet 
organizations, as 
well as packet 
how-to's. Also 
included is links to 
local and some 
not-so-local Amateur 
Radio Clubs.

Re: [PATCH v5 1/4] usb: dbc: early driver for xhci debug capability

From Lu Baolu@110:300/11 to All on Thu Jan 26 04:30:02 2017

Hi Ingo,

On 01/25/2017 05:23 PM, Ingo Molnar wrote:

* Lu Baolu <baolu.lu@linux.intel.com> wrote:

Hiding essentially an early udelay() implementation in an early-printk driver is
ugly and counterproductive.

Sure. How about below change?

diff --git a/drivers/usb/early/xhci-dbc.c b/drivers/usb/early/xhci-dbc.c
index d3f0c84..940989e 100644
--- a/drivers/usb/early/xhci-dbc.c
+++ b/drivers/usb/early/xhci-dbc.c
@@ -587,6 +587,35 @@ static int xdbc_bulk_transfer(void *data, int size, bool read)
return size;
}

+static void __init xdbc_udelay_calibration(void)
+{
+ unsigned long lpj = 0;
+ unsigned int tsc_khz, cpu_khz;
+
+ if (!boot_cpu_has(X86_FEATURE_TSC))
+ goto calibration_out;
+
+ cpu_khz = x86_platform.calibrate_cpu();
+ tsc_khz = x86_platform.calibrate_tsc();
+
+ if (tsc_khz == 0)
+ tsc_khz = cpu_khz;
+ else if (abs(cpu_khz - tsc_khz) * 10 > tsc_khz)
+ cpu_khz = tsc_khz;
+
+ if (!tsc_khz)
+ goto calibration_out;
+
+ lpj = tsc_khz * 1000;
+ do_div(lpj, HZ);
+
+calibration_out:
+ if (!lpj)
+ lpj = 1 << 22;
+
+ loops_per_jiffy = lpj;
+}
+
static int __init xdbc_early_setup(void)
{
int ret;
@@ -686,6 +715,8 @@ int __init early_xdbc_parse_parameter(char *s)
}
xdbc.xdbc_reg = (struct xdbc_regs __iomem *)(xdbc.xhci_base + offset);

+ xdbc_udelay_calibration();
+
return 0;
}

Yeah - so could we do this in a more generic fashion, not in the early-printk

driver but in core x86 code?

How about below changes?

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 4cfba94..aab7faa 100644
- --- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -835,6 +835,26 @@ dump_kernel_offset(struct notifier_block *self, unsigned long v, void *p)
return 0;
}

+static void __init simple_udelay_calibration(void)
+{
+ unsigned int tsc_khz, cpu_khz;
+ unsigned long lpj;
+
+ if (!boot_cpu_has(X86_FEATURE_TSC))
+ return;
+
+ cpu_khz = x86_platform.calibrate_cpu();
+ tsc_khz = x86_platform.calibrate_tsc();
+
+ tsc_khz = tsc_khz ? : cpu_khz;
+ if (!tsc_khz)
+ return;
+
+ lpj = tsc_khz * 1000;
+ do_div(lpj, HZ);
+ loops_per_jiffy = lpj;
+}
+
/*
* Determine if we were loaded by an EFI loader. If so, then we have also been
* passed the efi memmap, systab, etc., so we should use these data structures @@ -983,6 +1003,8 @@ void __init setup_arch(char **cmdline_p)
*/
x86_configure_nx();

+ simple_udelay_calibration();
+
parse_early_param();

If it's okay for you, do you want it in a separated patch or part of patch 2/4?

Best regards,
Lu Baolu

--- MBSE BBS v1.0.6.13 (GNU/Linux-x86_64
* Origin: linux.* mail to news gateway (110:300/11@linuxnet)

From Lu Baolu@110:300/11 to All on Thu Jan 26 04:40:01 2017

Hi,

On 01/26/2017 12:16 AM, Peter Zijlstra wrote:

On Wed, Jan 25, 2017 at 11:51:34PM +0800, Lu Baolu wrote:

What is timeout and why?

Put it in simple:

The driver sets the RUN bit in control register and polls READY
bit in status register for the successful USB device enumeration.
As the USB device enumeration might fail and the READY bit will
never be set, the driver must have a timeout logic to avoid
endless loop.

More details:

The operational model is that driver sets up all necessary registers
and data structures, and then starts the debug engine by setting
the RUN/STOP bit in the control register.

The debug engine then brings up itself as a ready-for-enumeration
USB device. The USB link between host and device starts link training
and then host will detect the connected device. The hub driver in
host will then starts the USB device enumeration processes (as defined
in USB spec). If everything goes smoothly, the device gets enumerated
and host can talk with the debug device.

After that, xdbc firmware will set the READY bit in status register. And
the driver can go ahead with data transfer over USB.

I have vague memories from a prior discussion where you said this READY
state can be lost at any time (cable unplug or whatnot) and at that
point the driver should re-start the setup, right?

Yes. So the documentation requires users not to unplug the usb
cable during debugging. This rule applies to other debug methods
as well.

If there is an error other than !ready, I would
expect the hardware to inform you of this through another status bit,
no?

Yeah, this might be another choice of hardware design. But it's not a
topic for this driver.

So is there really no way to way to distinguish between "I did setup and
am waiting for READY", "I did setup, am waiting for READY, but things
got hosed" and "I was READY, things be hosed" ?

I suppose the first and last can be distinguished by remembering if you
ever saw READY, but the first and second are the interesting case I
think.

So why can't you poll indefinitely for either ready or error?

Even if the hardware has both ready and error status bits, it's still
nice to have a time out watch dog. Buggy hardware or firmware
might not set any of these bits. Polling indefinitely might result in
a endless loop.

Loosing output, esp. without indication, is very _very_ annoying when
you're debugging things. Its just about on par with a stuck system, at
least then you know something bad happened.

Fair enough.

USB connection is stable enough, unless the user unplugs the
USB cable during debugging.

Best regards,
Lu Baolu

--- MBSE BBS v1.0.6.13 (GNU/Linux-x86_64
* Origin: linux.* mail to news gateway (110:300/11@linuxnet)

From Ingo Molnar@110:300/11 to All on Thu Jan 26 08:20:01 2017

* Lu Baolu <baolu.lu@linux.intel.com> wrote:

Fair enough.

USB connection is stable enough, unless the user unplugs the
USB cable during debugging.

What does the hardware do in this case? The XHCI registers are in the host hardware, so they won't disappear, right? Is there some cable connection status

bit we can extract without interrupts?

I.e. if there's any polling component then it would be reasonable to add an error
component: poll the status and if it goes 'disconnected' then disable early-printk
altogether in this case and trigger an emergency printk() so that there's chance
that the user notices [if the system does not misbehave otherwise].

I.e. try to be as robust and informative as lockdep - yet don't lock up the host
kernel: lockdep too is called from very deep internals, there are various conditions where it sees corrupt data structures (i.e. a 'disconnect' - a system
environment outside the normal bounds of operation), yet of the kernel and over

the last 10+ years of lockdep's existence we had very, very few cases of lockdep
itself locking up and behaving unpredictably.

Thanks,

Ingo

--- MBSE BBS v1.0.6.13 (GNU/Linux-x86_64
* Origin: linux.* mail to news gateway (110:300/11@linuxnet)

From Ingo Molnar@110:300/11 to All on Thu Jan 26 08:30:01 2017

* Lu Baolu <baolu.lu@linux.intel.com> wrote:

Hi,

On 01/25/2017 10:38 PM, Peter Zijlstra wrote:

On Wed, Jan 25, 2017 at 08:27:38PM +0800, Lu Baolu wrote:

In my driver, udelay() is mostly used to handle time out.

Xdbc hides most USB things in its firmware. Early printk driver only needs >> to setup the registers/data structures and wait until link ready or time

out.

Without udelay(), I have no means to convert the polling times into

waiting

time.

What is timeout and why?

Put it in simple:

The driver sets the RUN bit in control register and polls READY
bit in status register for the successful USB device enumeration.
As the USB device enumeration might fail and the READY bit will
never be set, the driver must have a timeout logic to avoid
endless loop.

Is there any error status available in the host registers anywhere that tells us
that enumeration did not succeed?

Thanks,

Ingo

--- MBSE BBS v1.0.6.13 (GNU/Linux-x86_64
* Origin: linux.* mail to news gateway (110:300/11@linuxnet)

From Lu Baolu@110:300/11 to All on Thu Jan 26 08:50:02 2017

Hi Ingo,

On 01/26/2017 03:19 PM, Ingo Molnar wrote:

* Lu Baolu <baolu.lu@linux.intel.com> wrote:

Fair enough.

USB connection is stable enough, unless the user unplugs the
USB cable during debugging.

What does the hardware do in this case? The XHCI registers are in the host hardware, so they won't disappear, right? Is there some cable connection

status

bit we can extract without interrupts?

Yes, there are register bits for us to know the cable status. I will go
through the spec again and give you more accurate answer later.

I'm sorry. I will be off during the next 7 days for Chinese New Year
holiday. My email access will be very limited during this time. I will
revisit this thread after I am back from holiday.

Sorry for the inconvenience.

Best regards,
Lu Baolu

I.e. if there's any polling component then it would be reasonable to add an

error

component: poll the status and if it goes 'disconnected' then disable

early-printk

altogether in this case and trigger an emergency printk() so that there's

chance

that the user notices [if the system does not misbehave otherwise].

I.e. try to be as robust and informative as lockdep - yet don't lock up the

host

kernel: lockdep too is called from very deep internals, there are various conditions where it sees corrupt data structures (i.e. a 'disconnect' - a

system

environment outside the normal bounds of operation), yet of the kernel and

over

the last 10+ years of lockdep's existence we had very, very few cases of

lockdep

itself locking up and behaving unpredictably.

Thanks,

Ingo

--- MBSE BBS v1.0.6.13 (GNU/Linux-x86_64
* Origin: linux.* mail to news gateway (110:300/11@linuxnet)

From Ingo Molnar@110:300/11 to All on Thu Jan 26 09:20:01 2017

* Lu Baolu <baolu.lu@linux.intel.com> wrote:

Hi Ingo,

On 01/26/2017 03:19 PM, Ingo Molnar wrote:

* Lu Baolu <baolu.lu@linux.intel.com> wrote:

Fair enough.

USB connection is stable enough, unless the user unplugs the
USB cable during debugging.

What does the hardware do in this case? The XHCI registers are in the host hardware, so they won't disappear, right? Is there some cable connection

status

bit we can extract without interrupts?

Yes, there are register bits for us to know the cable status. I will go through the spec again and give you more accurate answer later.

Ok, that's good news - so we don't really have to time out and we don't have to

rely on the user holding the phone right either.

I'm sorry. I will be off during the next 7 days for Chinese New Year
holiday. My email access will be very limited during this time. I will revisit this thread after I am back from holiday.

Sorry for the inconvenience.

No problem, have fun!

Thanks,

Ingo

--- MBSE BBS v1.0.6.13 (GNU/Linux-x86_64
* Origin: linux.* mail to news gateway (110:300/11@linuxnet)

From Peter Zijlstra@110:300/11 to All on Thu Jan 26 11:30:03 2017

On Thu, Jan 26, 2017 at 08:19:37AM +0100, Ingo Molnar wrote:

* Lu Baolu <baolu.lu@linux.intel.com> wrote:

Fair enough.

USB connection is stable enough, unless the user unplugs the
USB cable during debugging.

What does the hardware do in this case? The XHCI registers are in the host hardware, so they won't disappear, right? Is there some cable connection

status

bit we can extract without interrupts?

I.e. if there's any polling component then it would be reasonable to add an

error

component: poll the status and if it goes 'disconnected' then disable

early-printk

altogether in this case and trigger an emergency printk() so that there's

chance

that the user notices [if the system does not misbehave otherwise].

That'll be fun when printk() == early_printk() :-)

I myself wouldn't mind the system getting stuck until the link is re-established. My own damn fault for taking that cable out etc.

--- MBSE BBS v1.0.6.13 (GNU/Linux-x86_64
* Origin: linux.* mail to news gateway (110:300/11@linuxnet)

From Ingo Molnar@110:300/11 to All on Thu Jan 26 17:10:01 2017

* Peter Zijlstra <peterz@infradead.org> wrote:

On Thu, Jan 26, 2017 at 08:19:37AM +0100, Ingo Molnar wrote:

* Lu Baolu <baolu.lu@linux.intel.com> wrote:

Fair enough.

USB connection is stable enough, unless the user unplugs the
USB cable during debugging.

What does the hardware do in this case? The XHCI registers are in the host hardware, so they won't disappear, right? Is there some cable connection

status

bit we can extract without interrupts?

I.e. if there's any polling component then it would be reasonable to add an

error

component: poll the status and if it goes 'disconnected' then disable

early-printk

altogether in this case and trigger an emergency printk() so that there's

chance

that the user notices [if the system does not misbehave otherwise].

That'll be fun when printk() == early_printk() :-)

My suggestion would be to just print into the printk buffer directly in this case,
without console output - the developer will notice it in 'dmesg'.

I myself wouldn't mind the system getting stuck until the link is re-established. My own damn fault for taking that cable out etc.

That's fine too, although beyond the obvious "yanked the cable without realizing
it" case there are corner cases where usability is increased massively if the kernel is more proactive about error conditions: for example there are sub-standard USB cables and there are too long USB pathways from overloaded USB

hubs which can result in intermittent behavior, etc.

A clear diagnostic message in 'dmesg' that the USB host controller is unhappy about the USB-debug dongle device is a _lot_ more useful when troubleshooting such
problems than the occasional weird, non-deterministic hang...

Thanks,

Ingo

--- MBSE BBS v1.0.6.13 (GNU/Linux-x86_64
* Origin: linux.* mail to news gateway (110:300/11@linuxnet)

From Peter Zijlstra@110:300/11 to All on Thu Jan 26 18:50:01 2017

On Thu, Jan 26, 2017 at 05:01:05PM +0100, Ingo Molnar wrote:

I.e. if there's any polling component then it would be reasonable to add

an error

component: poll the status and if it goes 'disconnected' then disable

early-printk

altogether in this case and trigger an emergency printk() so that there's

chance

that the user notices [if the system does not misbehave otherwise].

That'll be fun when printk() == early_printk() :-)

My suggestion would be to just print into the printk buffer directly in this

case,

without console output - the developer will notice it in 'dmesg'.

When you map printk() onto early_printk() dmesg will be empty, there
will be nothing there, and therefore no reason what so ever to look
there. I certainly don't ever look there.

Note that the printk buffer itself is a major part of why printk sucks
donkey balls. Not to mention that you really cannot have an
early_printk() implementation that depends on printk().

I myself wouldn't mind the system getting stuck until the link is re-established. My own damn fault for taking that cable out etc.

That's fine too, although beyond the obvious "yanked the cable without

realizing

it" case there are corner cases where usability is increased massively if the

kernel is more proactive about error conditions: for example there are sub-standard USB cables and there are too long USB pathways from overloaded

USB

hubs which can result in intermittent behavior, etc.

A clear diagnostic message in 'dmesg' that the USB host controller is unhappy

about the USB-debug dongle device is a _lot_ more useful when troubleshooting

such

problems than the occasional weird, non-deterministic hang...

Sure, I'm just not sure what or where makes sense.

If your serial cable is bad you notice because you don't receive the
right amount of characters and or stuff gets mangled. You chuck the
cable and get a new one.

I think the most important part is re-establishing the link when the
cable gets re-inserted. Maybe we should just drop all characters written
when there's no link and leave it at that, same as serial.

--- MBSE BBS v1.0.6.13 (GNU/Linux-x86_64
* Origin: linux.* mail to news gateway (110:300/11@linuxnet)

From Ingo Molnar@110:300/11 to All on Fri Jan 27 08:00:01 2017

* Peter Zijlstra <peterz@infradead.org> wrote:

On Thu, Jan 26, 2017 at 05:01:05PM +0100, Ingo Molnar wrote:

I.e. if there's any polling component then it would be reasonable to

add an error

component: poll the status and if it goes 'disconnected' then disable

early-printk

altogether in this case and trigger an emergency printk() so that

there's chance

that the user notices [if the system does not misbehave otherwise].

That'll be fun when printk() == early_printk() :-)

My suggestion would be to just print into the printk buffer directly in

this case,

without console output - the developer will notice it in 'dmesg'.

When you map printk() onto early_printk() dmesg will be empty, there
will be nothing there, and therefore no reason what so ever to look
there.

Unless you want a third layer of a console driver putting the debug message into
dmesg isn't all that bad of a solution.

Let's admit it: something like USB that involves external pieces of hardware _does_ have failure modes, and troubleshooting messages instead of indefinite hangs are obviously more robust.

I certainly don't ever look there.

You'll have to teach yourself that if the box boots up fine but there are no messages whatsoever from the early-printk console that you'll need to look at dmesg output or the syslog for more clues.

This should not be a common occurrance in any case - but when it happens it's very
useful to have diagnostic messages. I don't think this is a controversial point
in
any fashion.

Note that the printk buffer itself is a major part of why printk sucks donkey

balls. Not to mention that you really cannot have an early_printk() implementation that depends on printk().

There are several easy solutions to do that, my favorite would be to put it into
the printk buffer totally unlocked. When your early-printk is active it's unused
and in the end it's a known data structure after all:

/*
* Just zap whatever's in the printk buffer and put your emergency message into

* it, prominently. No locking, no worries - don't generate emergency messages
* while printk is active and syslogd is running - this facility is a poor man's
* fallback printk() when early-printk has taken over all kernel logging:
*/
void printk_emergency_puts(const char *str)
{
struct printk_log *msg, *msg_end;

msg = log_buf;

memset(msg, 0, sizeof(*msg));
msg.text_len = strlen(str);

msg_end = (void *)msg + sizeof(*msg) + msg->text_len;

/* Zero ->len denotes end of log buffer: */
memset(msg_end, 0, sizeof(*msg_end));

snprintf(ptr, str);
}

....
printk_emergency_puts"earlyprintk emergency: Hardware timed out, shutting
down. Fix your debug cable?\n");
....

(Or so - totally untested, some details might be wrong.)

But yes, I agree with your wider point, I just looked at kernel/printk/printk.c

and puked. Why did we merge that crappy piece of binary logging code, when we already have two other binary logging facilities in the kernel already, both of

them better and cleaner than this?? Why did we mess up our nicely readable, simple, reliable ASCII log buffer printk code? :-(

I myself wouldn't mind the system getting stuck until the link is re-established. My own damn fault for taking that cable out etc.

That's fine too, although beyond the obvious "yanked the cable without realizing it" case there are corner cases where usability is increased massively if the kernel is more proactive about error conditions: for

example

there are sub-standard USB cables and there are too long USB pathways from overloaded USB hubs which can result in intermittent behavior, etc.

A clear diagnostic message in 'dmesg' that the USB host controller is

unhappy

about the USB-debug dongle device is a _lot_ more useful when

troubleshooting

such problems than the occasional weird, non-deterministic hang...

Sure, I'm just not sure what or where makes sense.

If your serial cable is bad you notice because you don't receive the right amount of characters and or stuff gets mangled. You chuck the cable and get a

new one.

I think the most important part is re-establishing the link when the cable

gets

re-inserted. Maybe we should just drop all characters written when there's no

link and leave it at that, same as serial.

That would be fine with me too - but even in this case there should be a stat counter somewhere (in /proc or /debug) that counts the number of characters dropped. Maybe that file could also display an emergency string - avoiding the interaction with the printk buffer.

We can do better than passive-aggressive logging behavior...

Thanks,

Ingo

--- MBSE BBS v1.0.6.13 (GNU/Linux-x86_64
* Origin: linux.* mail to news gateway (110:300/11@linuxnet)

Who's Online

Recent Visitors

System Info

Re: [PATCH v5 1/4] usb: dbc: early driver for xhci debug capability