3.12.7 problems with GPU_MEM=16 in config.txt #503

Closed
amtssp opened this issue Jan 18, 2014 · 23 comments

@amtssp

amtssp commented Jan 18, 2014

Hi

The 3.12.6 and 3.12.7 kernels won't fully boot if GPU_MEM in config.txt is reduced to 16 MB.

It may actually be booting, but only the raspberry image is shown in the top left corner of the screen, and keyboard input is not echoed.

The LEDs seem to indicate that the kernel has booted, but I'm unable to get any text on the screen.

@popcornmix
Collaborator

Does it work for you with 3.10.y kernel?

@msperl
Contributor

msperl commented Feb 2, 2014

I believe I may see a similar issue with 3.12 (a93bfa0) and 3.13 (6928683). In my case it is not necessarily related to GPU_MEM=16, but to the following settings.
config.txt:

gpu_mem_256=112
gpu_mem_512=368
cma_lwm=16
cma_hwm=32
cma_offline_start=16

cmdline.txt:
coherent_pool=6M smsc95xx.turbo_mode=N dwc_otg.lpm_enable=0 console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 elevator=deadline rootwait

as recommended in: http://www.raspberrypi.org/phpBB3/viewtopic.php?f=29&t=19334&start=125
then I get:

[    0.018694] ------------[ cut here ]------------
[    0.018771] WARNING: CPU: 0 PID: 1 at mm/page_alloc.c:2483 __alloc_pages_nodemask+0x1ac/0x89c()
[    0.018819] Modules linked in:
[    0.018859] CPU: 0 PID: 1 Comm: swapper Not tainted 3.12.9+ #20
[    0.018945] [<c0013fc0>] (unwind_backtrace+0x0/0xf0) from [<c0011264>] (show_stack+0x10/0x14)
[    0.019022] [<c0011264>] (show_stack+0x10/0x14) from [<c001ed1c>] (warn_slowpath_common+0x68/0x88)
[    0.019089] [<c001ed1c>] (warn_slowpath_common+0x68/0x88) from [<c001ed58>] (warn_slowpath_null+0x1c/0x24)
[    0.019160] [<c001ed58>] (warn_slowpath_null+0x1c/0x24) from [<c00a02cc>] (__alloc_pages_nodemask+0x1ac/0x89c)
[    0.019237] [<c00a02cc>] (__alloc_pages_nodemask+0x1ac/0x89c) from [<c00172d0>] (__dma_alloc_buffer.isra.20+0x2c/0xb8)
[    0.019308] [<c00172d0>] (__dma_alloc_buffer.isra.20+0x2c/0xb8) from [<c0017370>] (__alloc_remap_buffer.isra.23+0x14/0xa0)
[    0.019392] [<c0017370>] (__alloc_remap_buffer.isra.23+0x14/0xa0) from [<c0597514>] (atomic_pool_init+0x6c/0x10c)
[    0.019464] [<c0597514>] (atomic_pool_init+0x6c/0x10c) from [<c000851c>] (do_one_initcall+0x40/0x180)
[    0.019530] [<c000851c>] (do_one_initcall+0x40/0x180) from [<c0593b34>] (kernel_init_freeable+0xe8/0x1b4)
[    0.019601] [<c0593b34>] (kernel_init_freeable+0xe8/0x1b4) from [<c04121c4>] (kernel_init+0x8/0xe4)
[    0.019667] [<c04121c4>] (kernel_init+0x8/0xe4) from [<c000e158>] (ret_from_fork+0x14/0x3c)
[    0.019766] ---[ end trace da227214a82491b7 ]---
[    0.019808] DMA: failed to allocate 6144 KiB pool for atomic coherent allocation
[    0.020494] cpuidle: using governor ladder
...
[    1.525473] ------------[ cut here ]------------
[    1.531277] WARNING: CPU: 0 PID: 1 at arch/arm/mm/dma-mapping.c:491 __dma_alloc+0x20c/0x254()
[    1.542012] coherent pool not initialised!
[    1.547254] Modules linked in:
[    1.551445] CPU: 0 PID: 1 Comm: swapper Tainted: G        W    3.12.9+ #20
[    1.559479] [<c0013fc0>] (unwind_backtrace+0x0/0xf0) from [<c0011264>] (show_stack+0x10/0x14)
[    1.570309] [<c0011264>] (show_stack+0x10/0x14) from [<c001ed1c>] (warn_slowpath_common+0x68/0x88)
[    1.581718] [<c001ed1c>] (warn_slowpath_common+0x68/0x88) from [<c001edd0>] (warn_slowpath_fmt+0x30/0x40)
[    1.593911] [<c001edd0>] (warn_slowpath_fmt+0x30/0x40) from [<c0017608>] (__dma_alloc+0x20c/0x254)
[    1.605593] [<c0017608>] (__dma_alloc+0x20c/0x254) from [<c0017770>] (arm_dma_alloc+0x80/0x98)
[    1.617061] [<c0017770>] (arm_dma_alloc+0x80/0x98) from [<c05a8b0c>] (vchiq_platform_init+0x3c/0x1fc)
[    1.629342] [<c05a8b0c>] (vchiq_platform_init+0x3c/0x1fc) from [<c05a8a10>] (vchiq_init+0xe0/0x1a0)
[    1.641711] [<c05a8a10>] (vchiq_init+0xe0/0x1a0) from [<c000851c>] (do_one_initcall+0x40/0x180)
[    1.653793] [<c000851c>] (do_one_initcall+0x40/0x180) from [<c0593b34>] (kernel_init_freeable+0xe8/0x1b4)
[    1.666756] [<c0593b34>] (kernel_init_freeable+0xe8/0x1b4) from [<c04121c4>] (kernel_init+0x8/0xe4)
[    1.679325] [<c04121c4>] (kernel_init+0x8/0xe4) from [<c000e158>] (ret_from_fork+0x14/0x3c)
[    1.691197] ---[ end trace da227214a82491b8 ]---
[    1.697545] vchiq: Unable to allocate channel memory
[    1.704485] vchiq: could not load vchiq
...
several more such exceptions...

So this may be related in some way, with the result that DMA memory cannot be allocated.
Maybe it is related to the firmware?

But then again, the same settings work with a 3.11 (8f768c5) kernel.

One surprising thing though: access to every other device on the USB bus works fine (checked via the serial console); only the USB NIC code seems to fail...

Martin

@msperl
Contributor

msperl commented Feb 2, 2014

P.S.: only removing coherent_pool=6M
and

cma_lwm=16
cma_hwm=32
cma_offline_start=16

makes the device boot without any of those "exceptions/traces", and only then does the network work. Otherwise no packets are received; the TX counters go up for DHCP, but I am not sure whether the packets really get on the wire...
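
For anyone repeating this experiment, removing a parameter from the single-line cmdline.txt can be sketched as below. This is only an illustration: `drop_params` is a hypothetical helper, not part of any Raspberry Pi tooling, and the input is the command line quoted earlier in this thread.

```python
def drop_params(cmdline, names):
    """Return cmdline without the tokens whose name (text before '=') is in names."""
    kept = []
    for token in cmdline.split():
        name = token.split("=", 1)[0]
        if name not in names:
            kept.append(token)
    # cmdline.txt is a single line, so the tokens are joined with spaces.
    return " ".join(kept)


original = (
    "coherent_pool=6M smsc95xx.turbo_mode=N dwc_otg.lpm_enable=0 "
    "console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 "
    "root=/dev/mmcblk0p2 rootfstype=ext4 elevator=deadline rootwait"
)
cleaned = drop_params(original, {"coherent_pool"})
print(cleaned)
```

The same helper could be pointed at the `cma_*` keys, although those live in config.txt, one per line, rather than in cmdline.txt.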

@msperl
Contributor

msperl commented Feb 2, 2014

And if you look at the very early boot messages, there is a difference between 3.11

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.11.10+ (root@raspberrypi) (gcc version 4.6.3 (Debian 4.6.3-14+rpi1) ) #21 PREEMPT Sat Feb 1 19:53:59 UTC 2014
[    0.000000] CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), cr=00c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
[    0.000000] Machine: BCM2708
[    0.000000] early_vc_cma_mem(0/0x14c00000@0xa000000)
[    0.000000]  -> initial 0, size 14c00000, base a000000<6>[    0.000000] cma: CMA: reserved 332 MiB at 0a000000
[    0.000000] cma: CMA: reserved 16 MiB at 08000000
[    0.000000] Memory policy: ECC disabled, Data cache writeback
[    0.000000] On node 0 totalpages: 121856
[    0.000000] free_area_init_node: node 0, pgdat c05d7b20, node_mem_map c06830$
[    0.000000]   Normal zone: 984 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000]   Normal zone: 121856 pages, LIFO batch:31
[    0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[    0.000000] pcpu-alloc: [0] 0
...

and 3.13:

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.13.0+ (root@raspberrypi) (gcc version 4.6.3 (Deb$
[    0.000000] CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), c$
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instru$
[    0.000000] Machine: BCM2708
[    0.000000] early_vc_cma_mem(0/0x14c00000@0xa000000)
[    0.000000]  -> initial 0, size 14c00000, base a000000<3>[    0.000000] vc_cma: dma_declare_contiguous(14c00000,a000000) failed
[    0.000000] Memory policy: Data cache writeback
[    0.000000] On node 0 totalpages: 121856
[    0.000000] free_area_init_node: node 0, pgdat c05f86b4, node_mem_map c06a60$
[    0.000000]   Normal zone: 984 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000]   Normal zone: 121856 pages, LIFO batch:31
[    0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[    0.000000] pcpu-alloc: [0] 0

Do you see the missing newline in the "-> initial 0, ..." message, which causes the subsequent line to be concatenated (this should get fixed), and then the "vc_cma: dma_declare_contiguous(...) failed" error for the 3.13 kernel?
You can also see that the memory policy line has changed from "ECC disabled, Data cache writeback"
to "Data cache writeback"...

So something already fails VERY early during initialization... (and it is the same error for 3.12)
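
A quick way to check a captured boot log for the failures quoted above is to search for their message text. The sketch below is an assumption-laden illustration: the patterns are copied from the log lines in this thread, while the labels and the `find_signatures` name are made up for the example.

```python
import re

# Message fragments taken from the logs quoted in this thread.
SIGNATURES = {
    "vc_cma reservation failed": re.compile(
        r"vc_cma: dma_declare_contiguous\(.*\) failed"),
    "atomic pool allocation failed": re.compile(
        r"DMA: failed to allocate \d+ KiB pool"),
    "coherent pool missing": re.compile(r"coherent pool not initialised"),
    "vchiq channel memory failed": re.compile(
        r"vchiq: Unable to allocate channel memory"),
}


def find_signatures(log_text):
    """Return the labels of all known failure messages present in log_text."""
    return [label for label, pattern in SIGNATURES.items()
            if pattern.search(log_text)]


sample = """\
[    0.000000]  -> initial 0, size 14c00000, base a000000<3>[    0.000000] vc_cma: dma_declare_contiguous(14c00000,a000000) failed
[    0.019808] DMA: failed to allocate 6144 KiB pool for atomic coherent allocation
[    1.542012] coherent pool not initialised!
[    1.697545] vchiq: Unable to allocate channel memory
"""
print(find_signatures(sample))
```

Because `search` is used rather than `match`, the concatenated line caused by the missing newline is still detected.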

@amtssp
Author

amtssp commented Mar 9, 2014

It is still not booting fully in 3.13.y kernels if GPU_MEM is reduced to 16.

3.10.y kernels are OK
It hangs with just the raspberry icon in the top left corner.

@popcornmix
Collaborator

I'm not seeing this. I just booted and:

pi@raspberrypi:~ $ uname -a
Linux raspberrypi 3.13.6+ #938 PREEMPT Fri Mar 7 17:36:24 GMT 2014 armv6l GNU/Linux
pi@raspberrypi:~ $ vcgencmd version
Mar  7 2014 16:43:19 
Copyright (c) 2012 Broadcom
version c96cb035fdc907d28db836bbb1606aea2a8e73d9 (clean) (release)
pi@raspberrypi:~ $ vcgencmd get_mem gpu
gpu=16M
pi@raspberrypi:~ $ vcgencmd get_mem arm
arm=496M
pi@raspberrypi:~ $ free
             total       used       free     shared    buffers     cached
Mem:        496756      53192     443564          0         20      28592
-/+ buffers/cache:      24580     472176
Swap:            0          0          0

Do you have any config.txt settings or cmdline.txt settings that are non-default?
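
To answer that question it can help to list only the lines of config.txt that are actually in effect. A minimal sketch, assuming the usual `key=value` layout with `#` comments; `active_settings` is a hypothetical helper and the sample content is invented for the example.

```python
def active_settings(config_text):
    """Return (key, value) pairs for lines that are neither blank nor commented out."""
    settings = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore anything that is not key=value
            settings.append((key.strip(), value.strip()))
    return settings


# Invented sample; on a real system the text would be read from /boot/config.txt.
sample = """\
# uncomment to force a specific HDMI mode
#hdmi_group=1
gpu_mem=16
start_x=1
"""
for key, value in active_settings(sample):
    print(f"{key}={value}")
```

Which of the listed settings count as non-default still has to be judged by hand; the sketch only filters out comments and blank lines.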

@msperl
Contributor

msperl commented Mar 9, 2014

See my comments about the CMA setup, i.e. a non-fixed memory split...

@popcornmix
Collaborator

CMA is not officially supported. I was asking @amtssp who I believe isn't using CMA.

@msperl
Contributor

msperl commented Mar 9, 2014

OK - that is something new, which you may want to document somewhere.
I found out about the options (which are of most interest to a model A + camera use-cases) via: http://elinux.org/RPiconfig
I will add a comment there that it is not officially supported...

@popcornmix
Collaborator

@msperl
We tried getting it working, but it's never been reliable for me and so isn't enabled as standard.
You always seem to end up with a flood of kernel alloc failures under heavy load involving network (e.g. using midori). It doesn't get officially tested.

If someone who knows CMA well and understands what causes these alloc failures wants to help get it working, then we'd be interested in getting it working well.

@msperl
Contributor

msperl commented Mar 9, 2014

I tried it on mine with gpu_mem=16, and it booted with gpu_mem=128.
Only when setting gpu_mem=32 or higher was the configured value actually in effect after booting.

Here is an example with the GPU set to 31 MB:

root@raspberrypi:~# uptime
 18:23:23 up 4 min,  1 user,  load average: 0.02, 0.14, 0.08
root@raspberrypi:~# grep gpu_mem /boot/config.txt 
gpu_mem_256=31
gpu_mem_512=31
root@raspberrypi:~# free
             total       used       free     shared    buffers     cached
Mem:        382988      62236     320752          0      13864      26572
-/+ buffers/cache:      21800     361188
Swap:       262140          0     262140
root@raspberrypi:~# vcgencmd get_mem gpu
gpu=128M
root@raspberrypi:~# vcgencmd get_mem arm
arm=384M
root@raspberrypi:~# vcgencmd version
Jan 29 2014 14:58:39 
Copyright (c) 2012 Broadcom
version 0f547430c65eae8761de21ee72246bf0dc3bbf79 (clean) (release)

This could indicate that running with GPU_MEM<32M is no longer allowed with newer firmware versions, and that the issue of @amtssp is related to a FW version that still allowed it but triggered a bug in USB...
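
The comparison made above (requested split versus what `vcgencmd get_mem gpu` reports) can be sketched as follows. This only parses the output format shown in this thread (`gpu=128M`); the function names are hypothetical and nothing here calls `vcgencmd` itself.

```python
import re


def parse_get_mem(output):
    """Parse a line such as 'gpu=128M' into ('gpu', 128)."""
    match = re.fullmatch(r"(\w+)=(\d+)M", output.strip())
    if match is None:
        raise ValueError(f"unexpected get_mem output: {output!r}")
    return match.group(1), int(match.group(2))


def split_mismatch(requested_mb, get_mem_output):
    """Return the reported size in MB if it differs from the request, else None."""
    _, reported_mb = parse_get_mem(get_mem_output)
    return reported_mb if reported_mb != requested_mb else None


# Values taken from the two transcripts in this thread.
print(split_mismatch(31, "gpu=128M"))  # requested 31, firmware reports 128
print(split_mismatch(16, "gpu=16M"))   # requested 16, firmware reports 16
```

A non-`None` result means the firmware did not honour the requested split, which is the situation described in the comment above.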

@popcornmix
Collaborator

@msperl but my log shows gpu_mem=16 is supported by latest firmware/kernel.
Any value less than 32 causes start_cd.elf and fixup_cd.dat to be used.
My guess is you have a missing/broken start_cd.elf, or you are overriding the start file in config.txt.

@msperl
Contributor

msperl commented Mar 9, 2014

The only thing that I override is kernel=... to use my own compiled kernel, independently of what rpi-update provides...

But disabling start_x=1 (which gets enabled with the camera) gets me back to 16MB...

@popcornmix
Collaborator

Yes of course. There are three start files, a cutdown one (start_cd.elf), a normal one (start.elf) and an extended one (start_x.elf).
You can't have the cutdown memory usage along with the extended feature set.
To use the camera you need at least gpu_mem=64M.
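
The rules stated in this thread can be written down as a small check. This is a reading of the comments above, not the firmware's actual selection logic: the function names are hypothetical, and the assumption that `start_x=1` takes precedence over a small `gpu_mem` is inferred from the observation that disabling it restored the 16MB split.

```python
def expected_start_file(gpu_mem_mb, start_x=False):
    """Guess which start file is used, following the rules given in this thread."""
    if start_x:
        # Assumption: the extended firmware wins when start_x=1 is set.
        return "start_x.elf"
    if gpu_mem_mb < 32:
        # "Any value less than 32 causes start_cd.elf and fixup_cd.dat to be used."
        return "start_cd.elf"
    return "start.elf"


def config_warnings(gpu_mem_mb, start_x=False):
    """Return human-readable warnings for combinations the thread calls invalid."""
    warnings = []
    if start_x and gpu_mem_mb < 32:
        warnings.append("start_x=1 cannot be combined with the cutdown firmware (gpu_mem < 32)")
    if start_x and gpu_mem_mb < 64:
        warnings.append("the camera needs at least gpu_mem=64")
    return warnings


print(expected_start_file(16))
print(expected_start_file(31, start_x=True))
for warning in config_warnings(31, start_x=True):
    print("warning:", warning)
```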

popcornmix pushed a commit to raspberrypi/firmware that referenced this issue Mar 10, 2014
…f-two channels

firmware: hdmi: Allow hdmi channel map to be overridden with a gencmd

firmware: audioplus: limit sample rates to ones supported by hardware

firmware: mailbox: Add property to get memory handle from dispmanx resource
See: #257

firmware: Allow interrupts to be masked from GPU (e.g. when arm is handling them)
See: #257

firmware: memory reduction of cutdown firmware (saves about 1M)
See: raspberrypi/linux#503
popcornmix pushed a commit to Hexxeh/rpi-firmware that referenced this issue Mar 10, 2014
…f-two channels

firmware: hdmi: Allow hdmi channel map to be overridden with a gencmd

firmware: audioplus: limit sample rates to ones supported by hardware

firmware: mailbox: Add property to get memory handle from dispmanx resource
See: raspberrypi/firmware#257

firmware: Allow interrupts to be masked from GPU (e.g. when arm is handling them)
See: raspberrypi/firmware#257

firmware: memory reduction of cutdown firmware (saves about 1M)
See: raspberrypi/linux#503
@popcornmix
Collaborator

@amtssp
I've done some pruning on start_cd.elf, which saves about 1M.
If your problem was the GPU exhausting its share of memory, then this may have been fixed.
Can you test?

@amtssp
Author

amtssp commented May 1, 2014

Sorry for my late reply.
I'm now at kernel 3.14.1, but gpu_mem=16 still causes the boot process to stop when the raspberry image is shown. The LEDs still seem to flicker as they usually do when it is booting normally.

If I use gpu_mem=32 it boots fully and everything is fine.

@msperl
Contributor

msperl commented May 1, 2014

Can you connect a serial console and see what you get on the console when booting gpu_mem=16 and share it?

Alternatively, if you do not have a serial console at hand, you can try the following:
erase `/var/log/syslog`

then reboot with gpu_mem=16

and then, after some time with your "blinking" LED, boot back again with gpu_mem=32.

Then check /var/log/syslog and see whether you get 2 blocks of lines similar to this:

May  1 08:38:28 raspberrypi kernel: imklog 5.8.11, log source = /proc/kmsg started.
May  1 08:38:28 raspberrypi rsyslogd: [origin software="rsyslogd" swVersion="5.8.11" x-pid="1900" x-info="http://www.rsyslog.com"] start
May  1 08:38:28 raspberrypi kernel: [    0.000000] Booting Linux on physical CPU 0x0
May  1 08:38:28 raspberrypi kernel: [    0.000000] Initializing cgroup subsys cpu
May  1 08:38:28 raspberrypi kernel: [    0.000000] Initializing cgroup subsys cpuacct
May  1 08:38:28 raspberrypi kernel: [    0.000000] Linux version 3.13.4+ (root@raspberrypi) (gcc version 4.6.3 (Debian 4.6.3-14+rpi1) ) #27 PREEMPT Mon Mar 31 12:01:05 UTC 2014
May  1 08:38:28 raspberrypi kernel: [    0.000000] CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), cr=00c5387d
May  1 08:38:28 raspberrypi kernel: [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
May  1 08:38:28 raspberrypi kernel: [    0.000000] Machine: BCM2708

This would indicate that the kernel is booting far enough to get to a point where it can log its boot-messages to SD-card.

Then please share all those kernel: lines - especially the ones from the boot with gpu_mem=16
(note that you may want to sanitize these lines: Kernel command line: dma.dmachans=0x7f35 bcm2708_fb.fbwidth=6... removing the values of bcm2708.serial=... and smsc95xx.macaddr=...)

If you do not see the first boot, then we need to get the output from the serial console.
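
The syslog check described above can be sketched in a few lines: group the `kernel:` lines by boot and blank out the two values mentioned as worth sanitizing. This is a rough illustration only; `kernel_lines_per_boot` is a hypothetical helper, and the serial number and MAC address in the sample are invented.

```python
import re

BOOT_MARKER = "Booting Linux on physical CPU"
# Parameters whose values the comment above suggests removing before posting.
SENSITIVE = re.compile(r"\b(bcm2708\.serial|smsc95xx\.macaddr)=\S+")


def kernel_lines_per_boot(syslog_text):
    """Group sanitized 'kernel:' lines into one list per boot marker found."""
    boots = []
    for line in syslog_text.splitlines():
        if " kernel: " not in line:
            continue
        if BOOT_MARKER in line:
            boots.append([])  # a new boot starts here
        if boots:
            # Each kernel line is attached to the most recent boot marker seen.
            boots[-1].append(SENSITIVE.sub(r"\1=<removed>", line))
    return boots


# Invented sample with two boots.
sample = "\n".join([
    "May  1 08:38:28 raspberrypi kernel: [    0.000000] Booting Linux on physical CPU 0x0",
    "May  1 08:38:28 raspberrypi kernel: [    0.000000] Kernel command line: "
    "bcm2708.serial=0x1234 smsc95xx.macaddr=AA:BB:CC:DD:EE:FF root=/dev/mmcblk0p2",
    "May  1 08:38:28 raspberrypi rsyslogd: start",
    "May  1 08:45:02 raspberrypi kernel: [    0.000000] Booting Linux on physical CPU 0x0",
])
boots = kernel_lines_per_boot(sample)
print(len(boots), "boot(s) found")
```

Two boots in the result would mean the gpu_mem=16 boot got far enough to write its log; a single boot would point back to the serial console.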

@popcornmix
Collaborator

@amtssp
Can you confirm if this happens with latest 3.12.23 kernel?
Are you using CMA?

@amtssp
Author

amtssp commented Jun 30, 2014

@popcornmix
Sorry, I have not tested with 3.12.23 yet.
However, I just tried copying the newest firmware to my 3.14.2 kernel.
It still does not boot fully with gpu_mem=16 (it stops with the raspberry image in the top left corner). If I increase gpu_mem to 32, it boots fully.

I will try to find time to test the 3.12.23 later.

No, I don't use CMA.

@popcornmix
Collaborator

Anything non-default in cmdline.txt or config.txt?
I tried booting with gpu_mem=16 yesterday and it was fine.

@P33M
Contributor

P33M commented Jul 14, 2015

Closing as there has been no response from the OP. If gpu_mem=16 is still broken, then post a comment explaining what configuration it was used with.

@P33M P33M closed this as completed Jul 14, 2015
@NicoHood

NicoHood commented Jan 9, 2016

What about cma nowadays?

@popcornmix is it now supported or not? The wiki links to this github issue. However we are now on a newer kernel and we have got new hardware ready. I think CMA can be a great feature, so how far is the development? Maybe one could open a new ticket for that?

@popcornmix
Collaborator

CMA is not supported.

neuschaefer pushed a commit to neuschaefer/raspi-binary-firmware that referenced this issue Feb 27, 2017
…f-two channels

firmware: hdmi: Allow hdmi channel map to be overridden with a gencmd

firmware: audioplus: limit sample rates to ones supported by hardware

firmware: mailbox: Add property to get memory handle from dispmanx resource
See: raspberrypi#257

firmware: Allow interrupts to be masked from GPU (e.g. when arm is handling them)
See: raspberrypi#257

firmware: memory reduction of cutdown firmware (saves about 1M)
See: raspberrypi/linux#503
popcornmix pushed a commit that referenced this issue Apr 27, 2020
[ Upstream commit 022e9d6 ]

In the macsec_changelink(), "struct macsec_tx_sa tx_sc" is used to
store "macsec_secy.tx_sc".
But, the struct type of tx_sc is macsec_tx_sc, not macsec_tx_sa.
So, the macsec_tx_sc should be used instead.

Test commands:
    ip link add dummy0 type dummy
    ip link add macsec0 link dummy0 type macsec
    ip link set macsec0 type macsec encrypt off

Splat looks like:
[61119.963483][ T9335] ==================================================================
[61119.964709][ T9335] BUG: KASAN: slab-out-of-bounds in macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.965787][ T9335] Read of size 160 at addr ffff888020d69c68 by task ip/9335
[61119.966699][ T9335]
[61119.966979][ T9335] CPU: 0 PID: 9335 Comm: ip Not tainted 5.6.0+ #503
[61119.967791][ T9335] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[61119.968914][ T9335] Call Trace:
[61119.969324][ T9335]  dump_stack+0x96/0xdb
[61119.969809][ T9335]  ? macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.970554][ T9335]  print_address_description.constprop.5+0x1be/0x360
[61119.971294][ T9335]  ? macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.971973][ T9335]  ? macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.972703][ T9335]  __kasan_report+0x12a/0x170
[61119.973323][ T9335]  ? macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.973942][ T9335]  kasan_report+0xe/0x20
[61119.974397][ T9335]  check_memory_region+0x149/0x1a0
[61119.974866][ T9335]  memcpy+0x1f/0x50
[61119.975209][ T9335]  macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.975825][ T9335]  ? macsec_get_stats64+0x3e0/0x3e0 [macsec]
[61119.976451][ T9335]  ? kernel_text_address+0x111/0x120
[61119.976990][ T9335]  ? pskb_expand_head+0x25f/0xe10
[61119.977503][ T9335]  ? stack_trace_save+0x82/0xb0
[61119.977986][ T9335]  ? memset+0x1f/0x40
[61119.978397][ T9335]  ? __nla_validate_parse+0x98/0x1ab0
[61119.978936][ T9335]  ? macsec_alloc_tfm+0x90/0x90 [macsec]
[61119.979511][ T9335]  ? __kasan_slab_free+0x111/0x150
[61119.980021][ T9335]  ? kfree+0xce/0x2f0
[61119.980700][ T9335]  ? netlink_trim+0x196/0x1f0
[61119.981420][ T9335]  ? nla_memcpy+0x90/0x90
[61119.982036][ T9335]  ? register_lock_class+0x19e0/0x19e0
[61119.982776][ T9335]  ? memcpy+0x34/0x50
[61119.983327][ T9335]  __rtnl_newlink+0x922/0x1270
[ ... ]

Fixes: 3cf3227 ("net: macsec: hardware offloading infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>