free_netdev and locking - linux-kernel

is it required to unlock net_device structure before calling free_netdev? The code I encountered does the following:
static void delete_dev(struct net_device *dev)
{
ASSERT_RTNL();
...
unregister_netdevice(dev);
...
rtnl_unlock();
free_netdev(dev);
rtnl_lock();
}
int foo()
{
struct net_device *dev;
rtnl_lock();
...
delete_dev(dev);
rtnl_unlock();
return 0;
}
Is this the right way to do the things? Thanks.

You should not unlock-lock rtnl one more time.
Here is why: there is not a word in Network Drivers API in free_netdev section about lock required. Nevertheless, unregister_netdevice requires lock to be held and it has been wrapped in API in unregister_netdev function, which is stated in the document.
Anyway, if you look into some popular driver's sources, e1000e for example, you will see this:
6432 static void __devexit e1000_remove(struct pci_dev *pdev)
6433 {
...
6459 unregister_netdev(netdev);
6460
6461 if (pci_dev_run_wake(pdev))
6462 pm_runtime_get_noresume(&pdev->dev);
6463
6464 /*
6465 * Release control of h/w to f/w. If f/w is AMT enabled, this
6466 * would have already happened in close and is redundant.
6467 */
6468 e1000e_release_hw_control(adapter);
6469
6470 e1000e_reset_interrupt_capability(adapter);
6471 kfree(adapter->tx_ring);
6472 kfree(adapter->rx_ring);
6473
6474 iounmap(adapter->hw.hw_addr);
6475 if (adapter->hw.flash_address)
6476 iounmap(adapter->hw.flash_address);
6477 pci_release_selected_regions(pdev,
6478 pci_select_bars(pdev, IORESOURCE_MEM));
6479
6480 free_netdev(netdev);
6481
6482 /* AER disable */
6483 pci_disable_pcie_error_reporting(pdev);
6484
6485 pci_disable_device(pdev);
6486 }
As you can see, there is no unlock-lock taken.
Moreover, they use unregister_netdev function so that it would only be locked inside unregister_netdevice itself, and all dozens of deinitialinizations there would go outside the lock. Therefore, consider using simple unregister_netdev as they(netdev kernel developers) recommend in comments to it's source, if you think you can afford that.

Related

Rust debugging doesn't stop at the breakpoints when debugging stm32f407 via openocd and gdb

I have a problem debugging an stm32f407vet6 board and rust code.
The point of the problem is that GDB ignores breakpoints.
After setting breakpoints and executing the "continue" command in gdb, the program continues to ignore all breakpoints.
The only way to stop the program running is to cause an interrupt using the "ctrl + c" command.
After this command, the board stops its execution on the line currently being executed.
I have tried to set breakpoints on all lines where I can set them, but all the attempts are unsuccessful.
$ openocd
Open On-Chip Debugger 0.10.0 (2020-07-01) [https://github.com/sysprogs/openocd]
Licensed under GNU GPL v2
libusb1 09e75e98b4d9ea7909e8837b7a3f00dda4589dc3
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "hla_swd". To override use 'transport select <transport>'.
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : clock speed 2000 kHz
Error: libusb_open() failed with LIBUSB_ERROR_NOT_SUPPORTED
Info : STLINK V2J35S7 (API v2) VID:PID 0483:3748
Info : Target voltage: 6.436364
Info : stm32f4x.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : starting gdb server for stm32f4x.cpu on 3333
Info : Listening on port 3333 for gdb connections
$ arm-none-eabi-gdb -q target\thumbv7em-none-eabihf\debug\test_blink
Reading symbols from target\thumbv7em-none-eabihf\debug\test_blink...
(gdb) target remote :3333
Remote debugging using :3333
0x00004070 in core::ptr::read_volatile (src=0xe000e010) at C:\Users\User\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib/rustlib/src/rust\src/libcore/ptr/mod.rs:1005
1005 pub unsafe fn read_volatile<T>(src: *const T) -> T {
(gdb) load
Loading section .vector_table, size 0x1a8 lma 0x0
Loading section .text, size 0x47bc lma 0x1a8
Loading section .rodata, size 0xbf0 lma 0x4970
Start address 0x47a2, load size 21844
Transfer rate: 100 KB/sec, 5461 bytes/write.
(gdb) b main
Breakpoint 1 at 0x1f2: file src\main.rs, line 15.
(gdb) continue
Continuing.
Program received signal SIGINT, Interrupt.
0x00001530 in cortex_m::peripheral::syst::<impl cortex_m::peripheral::SYST>::has_wrapped (self=0x1000fc6c)
at C:\Users\User\.cargo\registry\src\github.com-1ecc6299db9ec823\cortex-m-0.6.3\src\peripheral/syst.rs:135
135 pub fn has_wrapped(&mut self) -> bool {
(gdb) bt
#0 0x00001530 in cortex_m::peripheral::syst::<impl cortex_m::peripheral::SYST>::has_wrapped (self=0x1000fc6c)
at C:\Users\User\.cargo\registry\src\github.com-1ecc6299db9ec823\cortex-m-0.6.3\src\peripheral/syst.rs:135
#1 0x00003450 in <stm32f4xx_hal::delay::Delay as embedded_hal::blocking::delay::DelayUs<u32>>::delay_us (self=0x1000fc6c, us=500000)
at C:\Users\User\.cargo\registry\src\github.com-1ecc6299db9ec823\stm32f4xx-hal-0.8.3\src/delay.rs:69
#2 0x0000339e in <stm32f4xx_hal::delay::Delay as embedded_hal::blocking::delay::DelayMs<u32>>::delay_ms (self=0x1000fc6c, ms=500)
at C:\Users\User\.cargo\registry\src\github.com-1ecc6299db9ec823\stm32f4xx-hal-0.8.3\src/delay.rs:32
#3 0x00000318 in test_blink::__cortex_m_rt_main () at src\main.rs:40
#4 0x000001f6 in main () at src\main.rs:15
memory.x file:
MEMORY
{
/* NOTE 1 K = 1 KiBi = 1024 bytes */
/* TODO Adjust these memory regions to match your device memory layout */
/* These values correspond to the LM3S6965, one of the few devices QEMU can emulate */
CCMRAM : ORIGIN = 0x10000000, LENGTH = 64K
RAM : ORIGIN = 0x20000000, LENGTH = 128K
FLASH : ORIGIN = 0x00000000, LENGTH = 512K
}
/* This is where the call stack will be allocated. */
/* The stack is of the full descending type. */
/* You may want to use this variable to locate the call stack and static
variables in different memory regions. Below is shown the default value */
_stack_start = ORIGIN(CCMRAM) + LENGTH(CCMRAM);
/* You can use this symbol to customize the location of the .text section */
/* If omitted the .text section will be placed right after the .vector_table
section */
/* This is required only on microcontrollers that store some configuration right
after the vector table */
/* _stext = ORIGIN(FLASH) + 0x400; */
/* Example of putting non-initialized variables into custom RAM locations. */
/* This assumes you have defined a region RAM2 above, and in the Rust
sources added the attribute `#[link_section = ".ram2bss"]` to the data
you want to place there. */
/* Note that the section will not be zero-initialized by the runtime! */
/* SECTIONS {
.ram2bss (NOLOAD) : ALIGN(4) {
*(.ram2bss);
. = ALIGN(4);
} > RAM2
} INSERT AFTER .bss;
*/
openocd.cfg file:
# Sample OpenOCD configuration for the STM32F3DISCOVERY development board
# Depending on the hardware revision you got you'll have to pick ONE of these
# interfaces. At any time only one interface should be commented out.
# Revision C (newer revision)
source [find interface/stlink.cfg]
# Revision A and B (older revisions)
# source [find interface/stlink-v2.cfg]
source [find target/stm32f4x.cfg]
# use hardware reset, connect under reset
# reset_config none separate
main.rs file:
#![no_main]
#![no_std]
#![allow(unsafe_code)]
// Halt on panic
#[allow(unused_extern_crates)] // NOTE(allow) bug rust-lang/rust#53964
extern crate panic_halt; // panic handler
use cortex_m;
use cortex_m_rt::entry;
use stm32f4xx_hal as hal;
use crate::hal::{prelude::*, stm32};
#[entry]
fn main() -> ! {
if let (Some(dp), Some(cp)) = (
stm32::Peripherals::take(),
cortex_m::peripheral::Peripherals::take(),
) {
let rcc = dp.RCC.constrain();
let clocks = rcc
.cfgr
.sysclk(168.mhz())
.freeze();
let mut delay = hal::delay::Delay::new(cp.SYST, clocks);
let gpioa = dp.GPIOA.split();
let mut l1 = gpioa.pa6.into_push_pull_output();
let mut l2 = gpioa.pa7.into_push_pull_output();
loop {
l1.set_low().unwrap();
l2.set_high().unwrap();
delay.delay_ms(500u32);
l1.set_high().unwrap();
l2.set_low().unwrap();
delay.delay_ms(500u32);
}
}
loop {}
}
Cargo.toml file:
[package]
name = "test_blink"
version = "0.1.0"
authors = ["Alex"]
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
embedded-hal = "0.2"
nb = "0.1.2"
cortex-m = "0.6"
cortex-m-rt = "0.6"
# Panic behaviour, see https://crates.io/keywords/panic-impl for alternatives
panic-halt = "0.2"
cortex-m-log="0.6.2"
[dependencies.stm32f4xx-hal]
version = "0.8.3"
features = ["rt", "stm32f407"]
I am new to rust embedded and maybe I have done something wrong, but I have already tried all the options I can find on the Internet.
At first I thought it was a problem with the cortex-debug plugin for vscode and even created the issue, but the guys couldn't help me because the problem is obviously not on their side.
Debugging "C" code in cubeIDE works, so I dare to assume that the problem is somewhere in rust--gdb--openocd. Perhaps I am missing something, but unfortunately I cannot find it myself yet.
I would appreciate any resources or ideas to solve this problem.
I'm hoping you checked out this resources:
Discovery - debug
From your screen-grab of arm-none-eabi-gdb it does indeed look it it did not hit the break point.
you should have seen this message afterwards:
Note: automatically using hardware breakpoints for read-only addresses.
Breakpoint 1, main () at ...
Did you compile your source with symbols, and unoptimised?
Your config all looks right to me otherwise.

Unable to see the plugin compiled in the custom wireshark run?

I am following the foo example given in the wireshark documentation.
I am able to build the foo code plugin. I am using wireshark 3.0.1 version. In the workroot folder, I have updated the target - PLUGIN_SRC_DIRS - plugins/epan/foo just before gryphon.
I can see that my code builds because I got some compilation error which I was able to fix it.
My foo code lives inside the plugins/epan folder.
I am running custom wireshark - sudo ./run/wireshark
There is a surprise here that I can't see even gryphon protocol field in the running wireshark. So in order to test this, I am typing foo or gryphon in that display filter and it turns red and it say foo is neither a protocol nor a protocol field. I am using Ubuntu 16.04 LTS to build it. The build goes fine.
Here is packet-foo.c
#include "config.h"
#include <epan/packet.h>
#include "packet-foo.h"
static int proto_foo = -1;
static int dissect_foo(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree _U_, void *data _U_);
void
proto_register_foo(void)
{
proto_foo = proto_register_protocol (
"FOO Protocol", /* name */
"FOO", /* short name */
"foo" /* abbrev */
);
}
void
proto_reg_handoff_foo(void)
{
static dissector_handle_t foo_handle;
foo_handle = create_dissector_handle(dissect_foo, proto_foo);
dissector_add_uint("udp.port", FOO_PORT, foo_handle);
}
static int
dissect_foo(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree _U_, void *data _U_)
{
col_set_str(pinfo->cinfo, COL_PROTOCOL, "FOO");
/* Clear out stuff in the info column */
col_clear(pinfo->cinfo,COL_INFO);
return tvb_captured_length(tvb);
}
Here is the packet-foo.h
#define FOO_PORT 1234
The CMakeLists.txt is here, this is actually a copy of gryphon.
So, I am wondering if gryphon wasn't recognised that means foo won't be recognised too. So, this file might be a source of problem.
include(WiresharkPlugin)
# Plugin name and version info (major minor micro extra)
set_module_info(foo 0 0 4 0)
set(DISSECTOR_SRC
packet-foo.c
)
set(PLUGIN_FILES
plugin.c
${DISSECTOR_SRC}
)
set_source_files_properties(
${PLUGIN_FILES}
PROPERTIES
COMPILE_FLAGS "${WERROR_COMMON_FLAGS}"
)
include_directories(${CMAKE_CURRENT_SOURCE_DIR})
register_plugin_files(plugin.c
plugin
${DISSECTOR_SRC}
)
add_plugin_library(foo epan)
target_link_libraries(foo epan)
install_plugin(foo epan)
file(GLOB DISSECTOR_HEADERS RELATIVE "${CMAKE_CURRENT_SOURCE_DIR}" "*.h")
CHECKAPI(
NAME
foo
SWITCHES
-g abort -g termoutput -build
SOURCES
${DISSECTOR_SRC}
${DISSECTOR_HEADERS}
)
Merely changing the plugin isn't sufficient.
You need to modify the top make file so that foo is actually installed.
vim CMakeListsCustom.txt.example
Firstly, uncomment - line number 16
plugins/epan/foo
Since your foo lives inside plugins/epan/foo
Now, rename this example to
mv CMakeListsCustom.txt.example CMakeListsCustom.txt
vim CMakeLists.txt
Insert a line number around 1408-
plugins/epan/foo
After that, do make
and then sudo make install
Here is the working copy -
https://github.com/joshis1/WiresharkDissectorFoo

CUDA: how to use barrier.sync

I have read https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-bar which details about PTX synchronization function.
It says there are 16 "barrier logical resource", and you can specify which barrier to use with the parameter "a". What is a barrier logical resource?
I have a piece of code from an outside source, which I know works. However, I cannot understand the syntax used inside "asm" and what "memory" does. I assume "name" replaces "%0" and "numThreads" replace "%1", but what is "memory" and what are the colons doing?
__device__ __forceinline__ void namedBarrierSync(int name, int numThreads) {
asm volatile("bar.sync %0, %1;" : : "r"(name), "r"(numThreads) : "memory");}
In a block of 256 threads, I only want threads 64 ~ 127 to synchronize. Is this possible with barrier.sync
function? ( for an example, say I have a grid of 1 block, block of 256 threads. we split the block into 3 conditional branches s.t. threads 0 ~ 63 go into kernel1, threads 64 ~ 127 go into kernel 2, and threads 128 ~ 255 go into kernel 3. I want threads in kernel 2 to only synchronize among themselves. So if I use the "namedBarrierSync" function defied above: "namedBarrierSync( 1, 64)". Then does it synchronize only threads 64 ~ 127, or threads 0 ~ 63?
I have tested with below code ( assume that gpuAssert is an error checking function defined somewhere in the file ).
Here is the code:
__global__ void test(int num_threads)
{
if (threadIdx.x >= 64 && threadIdx.x < 128)
{
namedBarrierSync(0, num_threads) ;
}
__syncthreads();
}
int main(void)
{
test<<<1, 1, 256>>>(128);
gpuAssert(cudaDeviceSynchronize(), __FILE__, __LINE_);
printf("complete\n");
return 1;
}
"barrier logical resource" are the hardware necessary to synchronize threads/warps in a thread block (probably atomic counters etc.). You don't need to know the actual hardware implementation to program them, it is sufficient to know there are 16 instances of them available.
As Robert Crovella has pointed out in your cross-post on the Nvidia forum, the documentation for inline PTX is at https://docs.nvidia.com/cuda/inline-ptx-assembly/index.html.
barrier.sync with a named barrier and thread count of 64 synchronizes the first two warps arriving at the named barrier (for compute capability up to 6.x) or the first 64 threads arriving at the named barrier (for compute capability 7.0 onwards).
Your test only launches a single thread (with 256 bytes of shared memory allocated to it), which makes tests of synchronisation instructions moot. You want to launch the test kernel as test<<<1, 256>>>(128); instead.

bpf_asm returns single ' on compile

I am new to Berkeley Packet Filter . I am trying to learn how to hand roll my own bpf code and then compile it using the bpf_asm. I am working on ubuntu 16.04 with kernel 4.4.0-137. I have downloaded the source code and I am working my way through the recommended reading, in Documentation/networking, specifically [filter.txt][1]. I have install binutils, built and installed bpf_asm located in tools/net using the provided makefile. Everything to this point appears to have gone okay. When I run bpf_asm -c bpf_example the program produces a single ' mark to standard out. The code I am trying to compile is the sample code provided in Documentation/networking/filter.txt I am including it here for completeness.
ld [4] /* offsetof(struct seccomp_data, arch) */
jne #0xc000003e, bad /* AUDIT_ARCH_X86_64 */
ld [0] /* offsetof(struct seccomp_data, nr) */
jeq #15, good /* __NR_rt_sigreturn */
jeq #231, good /* __NR_exit_group */
jeq #60, good /* __NR_exit */
jeq #0, good /* __NR_read */
jeq #1, good /* __NR_write */
jeq #5, good /* __NR_fstat */
jeq #9, good /* __NR_mmap */
jeq #14, good /* __NR_rt_sigprocmask */
jeq #13, good /* __NR_rt_sigaction */
jeq #35, good /* __NR_nanosleep */
bad: ret #0 /* SECCOMP_RET_KILL_THREAD */
good: ret #0x7fff0000 /* SECCOMP_RET_ALLOW */
The output that is mentioned in filter.txt is
$ ./bpf_asm -c foo
{ 0x28, 0, 0, 0x0000000c },
{ 0x15, 0, 1, 0x00000806 },
{ 0x06, 0, 0, 0xffffffff },
{ 0x06, 0, 0, 0000000000 },
However, my output is
./bpf_asm -c bpf_example
'
I am clearly messing something up. Can someone point out what I have overlooked, provide suggestions for things to try, or provide additional literature supplementary to filter.txt?
Thank you.
Edit
After reading the code more carefully I found that bpf_exp.y was invoking yyerror. With a message "lex unknown character", which I find somewhat strange since I yanked the text from filter.txt directly into a new file. Playing with bpf_exp.l I found some strange behavior in the production for ., which is used to catch any output that is not caught by the rest of the lexer and output an error. Commenting those out... which is probably a terrible idea, I was able to produce bpf output. However, it isn't equivalent to what filter.txt suggests as the output. It does however contain the same number of lines as the bpf that was input and after running it through the bpf_dbg program I recovered the same output that I input. Is this program no longer maintained? Or I am still not using it correctly? Furthermore, it would seem very difficult for the input bpf program in filter.txt to be output as the suggested program since I don't believe the parser has any sort of optimization for the output code. Hence it would seem like it needs to have the same number of line. Is that a correct assumption?

Linux kernel wl18xx module_init is it generated?

I'm looking at this drivers/net/wireless/ti/wl18xx driver module.
The traditional module_init() is not in the source code. Yet the trace dump shows a wl18xx_driver_init() is called, though that function again is not in the source code.
I can see the wl18xx_driver_init() in the objdump of main.o in that driver directory.
Is it that in late versions of kernels those functions/macros are automatically generated? How is that done?
wl18xx_driver_init is generated here with the expansion of module_platform_driver(wl18xx_driver) macro.
It expands roughly to smth like:
static int __init wl18xx_driver_init(void) {
return platform_driver_register(&(wl18xx_driver));
}
static initcall_t __initcall_wl18xx_driver_init6 __used __attribute__((__section__(".initcall" "6" ".init"))) = wl18xx_driver_init;
static void __exit wl18xx_driver_exit(void) {
platform_driver_unregister(&(wl18xx_driver));
}
static exitcall_t __exitcall_wl18xx_driver_exit __exit_call = wl18xx_driver_exit;
See module_platform_driver macro and module driver macro.
# It would be best to post some source code or links the next time, it would make it easier. Including kernel version would be also a good idea.

Resources