commit 6437abdb624edc4e68859e66768933d2b5eb9f09 Author: Greg Kroah-Hartman Date: Fri Jan 6 10:40:28 2017 +0100 Linux 4.9.1 commit 705df55bd0cf3530cc7e2b517f77d14585d3d24f Author: Adam Borowski Date: Sun Dec 11 02:09:18 2016 +0100 x86/kbuild: enable modversions for symbols exported from asm commit 334bb773876403eae3457d81be0b8ea70f8e4ccc upstream. Commit 4efca4ed ("kbuild: modversions for EXPORT_SYMBOL() for asm") adds modversion support for symbols exported from asm files. Architectures must include C-style declarations for those symbols in asm/asm-prototypes.h in order for them to be versioned. Add these declarations for x86, and an architecture-independent file that can be used for common symbols. With f27c2f6 reverting 8ab2ae6 ("default exported asm symbols to zero") we produce a scary warning on x86, this commit fixes that. Signed-off-by: Adam Borowski Tested-by: Kalle Valo Acked-by: Nicholas Piggin Tested-by: Peter Wu Tested-by: Oliver Hartkopp Signed-off-by: Michal Marek Signed-off-by: Greg Kroah-Hartman commit c728f2b5edf24bac22b0549e3a5b3622c35b9eba Author: Adam Borowski Date: Thu Nov 24 02:27:03 2016 +0100 builddeb: fix cross-building to arm64 producing host-arch debs commit 152b695d74376bfe55cd2a6265ccc75b0d39dd19 upstream. Both Debian and kernel archs are "arm64" but UTS_MACHINE and gcc say "aarch64". Recognizing just the latter should be enough but let's accept both in case something regresses again or an user sets UTS_MACHINE=arm64. Regressed in cfa88c7: arm64: Set UTS_MACHINE in the Makefile. Signed-off-by: Adam Borowski Acked-by: Riku Voipio Signed-off-by: Michal Marek Signed-off-by: Greg Kroah-Hartman commit e12096297ea55330cabe576596ad3f0b0fa4b8f4 Author: Eric Sandeen Date: Mon Dec 5 12:31:06 2016 +1100 xfs: set AGI buffer type in xlog_recover_clear_agi_bucket commit 6b10b23ca94451fae153a5cc8d62fd721bec2019 upstream. xlog_recover_clear_agi_bucket didn't set the type to XFS_BLFT_AGI_BUF, so we got a warning during log replay (or an ASSERT on a debug build). XFS (md0): Unknown buffer type 0! XFS (md0): _xfs_buf_ioapply: no ops on block 0xaea8802/0x1 Fix this, as was done in f19b872b for 2 other locations with the same problem. Signed-off-by: Eric Sandeen Reviewed-by: Brian Foster Reviewed-by: Christoph Hellwig Signed-off-by: Dave Chinner Signed-off-by: Greg Kroah-Hartman commit c11a13d6f5274167a7c44e08722162ae3acce437 Author: Eric Sandeen Date: Tue Nov 8 12:55:18 2016 +1100 xfs: fix up xfs_swap_extent_forks inline extent handling commit 4dfce57db6354603641132fac3c887614e3ebe81 upstream. There have been several reports over the years of NULL pointer dereferences in xfs_trans_log_inode during xfs_fsr processes, when the process is doing an fput and tearing down extents on the temporary inode, something like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 PID: 29439 TASK: ffff880550584fa0 CPU: 6 COMMAND: "xfs_fsr" [exception RIP: xfs_trans_log_inode+0x10] #9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs] #10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs] #11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs] #12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs] #13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs] #14 [ffff8800a57bbe00] evict at ffffffff811e1b67 #15 [ffff8800a57bbe28] iput at ffffffff811e23a5 #16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8 #17 [ffff8800a57bbe88] dput at ffffffff811dd06c #18 [ffff8800a57bbea8] __fput at ffffffff811c823b #19 [ffff8800a57bbef0] ____fput at ffffffff811c846e #20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27 #21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c #22 [ffff8800a57bbf50] int_signal at ffffffff8161405d As it turns out, this is because the i_itemp pointer, along with the d_ops pointer, has been overwritten with zeros when we tear down the extents during truncate. When the in-core inode fork on the temporary inode used by xfs_fsr was originally set up during the extent swap, we mistakenly looked at di_nextents to determine whether all extents fit inline, but this misses extents generated by speculative preallocation; we should be using if_bytes instead. This mistake corrupts the in-memory inode, and code in xfs_iext_remove_inline eventually gets bad inputs, causing it to memmove and memset incorrect ranges; this became apparent because the two values in ifp->if_u2.if_inline_ext[1] contained what should have been in d_ops and i_itemp; they were memmoved due to incorrect array indexing and then the original locations were zeroed with memset, again due to an array overrun. Fix this by properly using i_df.if_bytes to determine the number of extents, not di_nextents. Thanks to dchinner for looking at this with me and spotting the root cause. Signed-off-by: Eric Sandeen Reviewed-by: Brian Foster Signed-off-by: Dave Chinner Signed-off-by: Greg Kroah-Hartman commit e67053ad4840676f34215823770ba6d95037d138 Author: Julien Grall Date: Wed Dec 7 12:24:40 2016 +0000 arm/xen: Use alloc_percpu rather than __alloc_percpu commit 24d5373dda7c00a438d26016bce140299fae675e upstream. The function xen_guest_init is using __alloc_percpu with an alignment which are not power of two. However, the percpu allocator never supported alignments which are not power of two and has always behaved incorectly in thise case. Commit 3ca45a4 "percpu: ensure requested alignment is power of two" introduced a check which trigger a warning [1] when booting linux-next on Xen. But in reality this bug was always present. This can be fixed by replacing the call to __alloc_percpu with alloc_percpu. The latter will use an alignment which are a power of two. [1] [ 0.023921] illegal size (48) or align (48) for percpu allocation [ 0.024167] ------------[ cut here ]------------ [ 0.024344] WARNING: CPU: 0 PID: 1 at linux/mm/percpu.c:892 pcpu_alloc+0x88/0x6c0 [ 0.024584] Modules linked in: [ 0.024708] [ 0.024804] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc7-next-20161128 #473 [ 0.025012] Hardware name: Foundation-v8A (DT) [ 0.025162] task: ffff80003d870000 task.stack: ffff80003d844000 [ 0.025351] PC is at pcpu_alloc+0x88/0x6c0 [ 0.025490] LR is at pcpu_alloc+0x88/0x6c0 [ 0.025624] pc : [] lr : [] pstate: 60000045 [ 0.025830] sp : ffff80003d847cd0 [ 0.025946] x29: ffff80003d847cd0 x28: 0000000000000000 [ 0.026147] x27: 0000000000000000 x26: 0000000000000000 [ 0.026348] x25: 0000000000000000 x24: 0000000000000000 [ 0.026549] x23: 0000000000000000 x22: 00000000024000c0 [ 0.026752] x21: ffff000008e97000 x20: 0000000000000000 [ 0.026953] x19: 0000000000000030 x18: 0000000000000010 [ 0.027155] x17: 0000000000000a3f x16: 00000000deadbeef [ 0.027357] x15: 0000000000000006 x14: ffff000088f79c3f [ 0.027573] x13: ffff000008f79c4d x12: 0000000000000041 [ 0.027782] x11: 0000000000000006 x10: 0000000000000042 [ 0.027995] x9 : ffff80003d847a40 x8 : 6f697461636f6c6c [ 0.028208] x7 : 6120757063726570 x6 : ffff000008f79c84 [ 0.028419] x5 : 0000000000000005 x4 : 0000000000000000 [ 0.028628] x3 : 0000000000000000 x2 : 000000000000017f [ 0.028840] x1 : ffff80003d870000 x0 : 0000000000000035 [ 0.029056] [ 0.029152] ---[ end trace 0000000000000000 ]--- [ 0.029297] Call trace: [ 0.029403] Exception stack(0xffff80003d847b00 to 0xffff80003d847c30) [ 0.029621] 7b00: 0000000000000030 0001000000000000 ffff80003d847cd0 ffff00000818e678 [ 0.029901] 7b20: 0000000000000002 0000000000000004 ffff000008f7c060 0000000000000035 [ 0.030153] 7b40: ffff000008f79000 ffff000008c4cd88 ffff80003d847bf0 ffff000008101778 [ 0.030402] 7b60: 0000000000000030 0000000000000000 ffff000008e97000 00000000024000c0 [ 0.030647] 7b80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 0.030895] 7ba0: 0000000000000035 ffff80003d870000 000000000000017f 0000000000000000 [ 0.031144] 7bc0: 0000000000000000 0000000000000005 ffff000008f79c84 6120757063726570 [ 0.031394] 7be0: 6f697461636f6c6c ffff80003d847a40 0000000000000042 0000000000000006 [ 0.031643] 7c00: 0000000000000041 ffff000008f79c4d ffff000088f79c3f 0000000000000006 [ 0.031877] 7c20: 00000000deadbeef 0000000000000a3f [ 0.032051] [] pcpu_alloc+0x88/0x6c0 [ 0.032229] [] __alloc_percpu+0x18/0x20 [ 0.032409] [] xen_guest_init+0x174/0x2f4 [ 0.032591] [] do_one_initcall+0x38/0x130 [ 0.032783] [] kernel_init_freeable+0xe0/0x248 [ 0.032995] [] kernel_init+0x10/0x100 [ 0.033172] [] ret_from_fork+0x10/0x50 Reported-by: Wei Chen Link: https://lkml.org/lkml/2016/11/28/669 Signed-off-by: Julien Grall Signed-off-by: Stefano Stabellini Reviewed-by: Stefano Stabellini Signed-off-by: Greg Kroah-Hartman commit 45394bf3e11ea50a704f0c7cb97bfe87af3559f0 Author: Boris Ostrovsky Date: Mon Nov 21 09:56:06 2016 -0500 xen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancing commit 30faaafdfa0c754c91bac60f216c9f34a2bfdf7e upstream. Commit 9c17d96500f7 ("xen/gntdev: Grant maps should not be subject to NUMA balancing") set VM_IO flag to prevent grant maps from being subjected to NUMA balancing. It was discovered recently that this flag causes get_user_pages() to always fail with -EFAULT. check_vma_flags __get_user_pages __get_user_pages_locked __get_user_pages_unlocked get_user_pages_fast iov_iter_get_pages dio_refill_pages do_direct_IO do_blockdev_direct_IO do_blockdev_direct_IO ext4_direct_IO_read generic_file_read_iter aio_run_iocb (which can happen if guest's vdisk has direct-io-safe option). To avoid this let's use VM_MIXEDMAP flag instead --- it prevents NUMA balancing just as VM_IO does and has no effect on check_vma_flags(). Reported-by: Olaf Hering Suggested-by: Hugh Dickins Signed-off-by: Boris Ostrovsky Acked-by: Hugh Dickins Tested-by: Olaf Hering Signed-off-by: Juergen Gross Signed-off-by: Greg Kroah-Hartman commit b7bbf06c21aac5c91d3c7f73db291648232dfb39 Author: Jason Gunthorpe Date: Wed Oct 26 16:28:45 2016 -0600 tpm xen: Remove bogus tpm_chip_unregister commit 1f0f30e404b3d8f4597a2d9b77fba55452f8fd0e upstream. tpm_chip_unregister can only be called after tpm_chip_register. devm manages the allocation so no unwind is needed here. Fixes: afb5abc262e96 ("tpm: two-phase chip management functions") Reviewed-by: Jarkko Sakkinen Signed-off-by: Jarkko Sakkinen Signed-off-by: Greg Kroah-Hartman commit f726f4f411f9094a3901fb35985f070c0fdec5b2 Author: Douglas Anderson Date: Wed Dec 14 15:05:49 2016 -0800 kernel/debug/debug_core.c: more properly delay for secondary CPUs commit 2d13bb6494c807bcf3f78af0e96c0b8615a94385 upstream. We've got a delay loop waiting for secondary CPUs. That loop uses loops_per_jiffy. However, loops_per_jiffy doesn't actually mean how many tight loops make up a jiffy on all architectures. It is quite common to see things like this in the boot log: Calibrating delay loop (skipped), value calculated using timer frequency.. 48.00 BogoMIPS (lpj=24000) In my case I was seeing lots of cases where other CPUs timed out entering the debugger only to print their stack crawls shortly after the kdb> prompt was written. Elsewhere in kgdb we already use udelay(), so that should be safe enough to use to implement our timeout. We'll delay 1 ms for 1000 times, which should give us a full second of delay (just like the old code wanted) but allow us to notice that we're done every 1 ms. [akpm@linux-foundation.org: simplifications, per Daniel] Link: http://lkml.kernel.org/r/1477091361-2039-1-git-send-email-dianders@chromium.org Signed-off-by: Douglas Anderson Reviewed-by: Daniel Thompson Cc: Jason Wessel Cc: Brian Norris Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 9b78d69054243a0d047ebccdd64cb8a2462e174e Author: Christian Lamparter Date: Mon Nov 14 02:11:16 2016 +0100 watchdog: qcom: fix kernel panic due to external abort on non-linefetch commit f06f35c66fdbd5ac38901a3305ce763a0cd59375 upstream. This patch fixes a off-by-one in the "watchdog: qcom: add option for standalone watchdog not in timer block" patch that causes the following panic on boot: > Unhandled fault: external abort on non-linefetch (0x1008) at 0xc8874002 > pgd = c0204000 > [c8874002] *pgd=87806811, *pte=0b017653, *ppte=0b017453 > Internal error: : 1008 [#1] SMP ARM > CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.8.6 #0 > Hardware name: Generic DT based system > PC is at 0xc02222f4 > LR is at 0x1 > pc : [] lr : [<00000001>] psr: 00000113 > sp : c782fc98 ip : 00000003 fp : 00000000 > r10: 00000004 r9 : c782e000 r8 : c04ab98c > r7 : 00000001 r6 : c8874002 r5 : c782fe00 r4 : 00000002 > r3 : 00000000 r2 : c782fe00 r1 : 00100000 r0 : c8874002 > Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none > Control: 10c5387d Table: 8020406a DAC: 00000051 > Process swapper/0 (pid: 1, stack limit = 0xc782e210) > Stack: (0xc782fc98 to 0xc7830000) > [...] The WDT_STS (status) needs to be translated via wdt_addr as well. fixes: f0d9d0f4b44a ("watchdog: qcom: add option for standalone watchdog not in timer block") Signed-off-by: Christian Lamparter Reviewed-by: Guenter Roeck Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit 2eccf0e0bcb1b164b93d83b4578a1f39ace3fff6 Author: Alexander Usyskin Date: Tue Nov 8 17:55:52 2016 +0200 watchdog: mei_wdt: request stop on reboot to prevent false positive event commit 9eff1140a82db8c5520f76e51c21827b4af670b3 upstream. Systemd on reboot enables shutdown watchdog that leaves the watchdog device open to ensure that even if power down process get stuck the platform reboots nonetheless. The iamt_wdt is an alarm-only watchdog and can't reboot system, but the FW will generate an alarm event reboot was completed in time, as the watchdog is not automatically disabled during power cycle. So we should request stop watchdog on reboot to eliminate wrong alarm from the FW. Signed-off-by: Alexander Usyskin Signed-off-by: Tomas Winkler Reviewed-by: Guenter Roeck Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit 36b08b819713e587fa43b94516302dd8732dd27a Author: Konstantin Khlebnikov Date: Wed Dec 14 15:04:04 2016 -0800 kernel/watchdog: use nmi registers snapshot in hardlockup handler commit 4d1f0fb096aedea7bb5489af93498a82e467c480 upstream. NMI handler doesn't call set_irq_regs(), it's set only by normal IRQ. Thus get_irq_regs() returns NULL or stale registers snapshot with IP/SP pointing to the code interrupted by IRQ which was interrupted by NMI. NULL isn't a problem: in this case watchdog calls dump_stack() and prints full stack trace including NMI. But if we're stuck in IRQ handler then NMI watchlog will print stack trace without IRQ part at all. This patch uses registers snapshot passed into NMI handler as arguments: these registers point exactly to the instruction interrupted by NMI. Fixes: 55537871ef66 ("kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup") Link: http://lkml.kernel.org/r/146771764784.86724.6006627197118544150.stgit@buzz Signed-off-by: Konstantin Khlebnikov Cc: Jiri Kosina Cc: Ulrich Obergfell Cc: Aaron Tomlin Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit c954acc0007bf2a33a34505e21bc5d617a2fe2aa Author: Pavel Shilovsky Date: Tue Nov 29 16:14:43 2016 -0800 CIFS: Fix a possible memory corruption in push locks commit e3d240e9d505fc67f8f8735836df97a794bbd946 upstream. If maxBuf is not 0 but less than a size of SMB2 lock structure we can end up with a memory corruption. Signed-off-by: Pavel Shilovsky Signed-off-by: Greg Kroah-Hartman commit 9f9d98246e5f7e70cf661be2253c75f4185b60a6 Author: Pavel Shilovsky Date: Wed Nov 16 15:17:15 2016 -0800 CIFS: Decrease verbosity of ioctl call commit b0a752b5ce76bd1d8b733a53757c3263511dcb69 upstream. Reviewed-by: Aurelien Aptel Acked-by: Sachin Prabhu Signed-off-by: Pavel Shilovsky Signed-off-by: Greg Kroah-Hartman commit 46890ffba1d62550c45f4412378daca9586ba51b Author: Pavel Shilovsky Date: Tue Nov 29 11:31:23 2016 -0800 CIFS: Fix a possible double locking of mutex during reconnect commit 96a988ffeb90dba33a71c3826086fe67c897a183 upstream. With the current code it is possible to lock a mutex twice when a subsequent reconnects are triggered. On the 1st reconnect we reconnect sessions and tcons and then persistent file handles. If the 2nd reconnect happens during the reconnecting of persistent file handles then the following sequence of calls is observed: cifs_reopen_file -> SMB2_open -> small_smb2_init -> smb2_reconnect -> cifs_reopen_persistent_file_handles -> cifs_reopen_file (again!). So, we are trying to acquire the same cfile->fh_mutex twice which is wrong. Fix this by moving reconnecting of persistent handles to the delayed work (smb2_reconnect_server) and submitting this work every time we reconnect tcon in SMB2 commands handling codepath. This can also lead to corruption of a temporary file list in cifs_reopen_persistent_file_handles() because we can recursively call this function twice. Signed-off-by: Pavel Shilovsky Signed-off-by: Greg Kroah-Hartman commit 69d13b69e79cb76413b49a66e43a2be39f14aefe Author: Pavel Shilovsky Date: Tue Nov 29 11:30:58 2016 -0800 CIFS: Fix missing nls unload in smb2_reconnect() commit 4772c79599564bd08ee6682715a7d3516f67433f upstream. Acked-by: Sachin Prabhu Signed-off-by: Pavel Shilovsky Signed-off-by: Greg Kroah-Hartman commit 48f9526f4dcb4b132fe0dc2450835311e3b013a6 Author: Pavel Shilovsky Date: Fri Nov 4 11:50:31 2016 -0700 CIFS: Fix a possible memory corruption during reconnect commit 53e0e11efe9289535b060a51d4cf37c25e0d0f2b upstream. We can not unlock/lock cifs_tcp_ses_lock while walking through ses and tcon lists because it can corrupt list iterator pointers and a tcon structure can be released if we don't hold an extra reference. Fix it by moving a reconnect process to a separate delayed work and acquiring a reference to every tcon that needs to be reconnected. Also do not send an echo request on newly established connections. Signed-off-by: Pavel Shilovsky Signed-off-by: Greg Kroah-Hartman commit 7aa58e7ad53bd9536aa49a18ccd0778c728bf57d Author: Andy Lutomirski Date: Mon Dec 12 12:54:37 2016 -0800 cifs: Fix smbencrypt() to stop pointing a scatterlist at the stack commit 06deeec77a5a689cc94b21a8a91a76e42176685d upstream. smbencrypt() points a scatterlist to the stack, which is breaks if CONFIG_VMAP_STACK=y. Fix it by switching to crypto_cipher_encrypt_one(). The new code should be considerably faster as an added benefit. This code is nearly identical to some code that Eric Biggers suggested. Reported-by: Eric Biggers Signed-off-by: Andy Lutomirski Acked-by: Jeff Layton Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 41c856b329008609157199552e767a7a02d62cd1 Author: Takashi Iwai Date: Fri Nov 25 16:54:06 2016 +0100 ASoC: intel: Fix crash at suspend/resume without card registration commit 2fc995a87f2efcd803438f07bfecd35cc3d90d32 upstream. When ASoC Intel SST Medfield driver is probed but without codec / card assigned, it causes an Oops and freezes the kernel at suspend/resume, PM: Suspending system (freeze) Suspending console(s) (use no_console_suspend to debug) BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: [] sst_soc_prepare+0x19/0xa0 [snd_soc_sst_mfld_platform] Oops: 0000 [#1] PREEMPT SMP CPU: 0 PID: 1552 Comm: systemd-sleep Tainted: G W 4.9.0-rc6-1.g5f5c2ad-default #1 Call Trace: [] dpm_prepare+0x209/0x460 [] dpm_suspend_start+0x11/0x60 [] suspend_devices_and_enter+0xb2/0x710 [] pm_suspend+0x30e/0x390 [] state_store+0x8a/0x90 [] kobj_attr_store+0xf/0x20 [] sysfs_kf_write+0x37/0x40 [] kernfs_fop_write+0x11c/0x1b0 [] __vfs_write+0x28/0x140 [] ? apparmor_file_permission+0x18/0x20 [] ? security_file_permission+0x3b/0xc0 [] vfs_write+0xb5/0x1a0 [] SyS_write+0x46/0xa0 [] entry_SYSCALL_64_fastpath+0x1e/0xad Add proper NULL checks in the PM code of mdfld driver. Signed-off-by: Takashi Iwai Acked-by: Vinod Koul Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit f5dca4881fac42e2db76b1d93a3b3c0895ba49bb Author: Benjamin Marzinski Date: Wed Nov 30 17:56:14 2016 -0600 dm space map metadata: fix 'struct sm_metadata' leak on failed create commit 314c25c56c1ee5026cf99c570bdfe01847927acb upstream. In dm_sm_metadata_create() we temporarily change the dm_space_map operations from 'ops' (whose .destroy function deallocates the sm_metadata) to 'bootstrap_ops' (whose .destroy function doesn't). If dm_sm_metadata_create() fails in sm_ll_new_metadata() or sm_ll_extend(), it exits back to dm_tm_create_internal(), which calls dm_sm_destroy() with the intention of freeing the sm_metadata, but it doesn't (because the dm_space_map operations is still set to 'bootstrap_ops'). Fix this by setting the dm_space_map operations back to 'ops' if dm_sm_metadata_create() fails when it is set to 'bootstrap_ops'. Signed-off-by: Benjamin Marzinski Acked-by: Joe Thornber Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 461f272954cf7a86ce60963bc8b95a9f123b4036 Author: Heinz Mauelshagen Date: Tue Nov 29 22:37:30 2016 +0100 dm raid: fix discard support regression commit 11e2968478edc07a75ee1efb45011b3033c621c2 upstream. Commit ecbfb9f118 ("dm raid: add raid level takeover support") moved the configure_discard_support() call from raid_ctr() to raid_preresume(). Enabling/disabling discard _must_ happen during table load (through the .ctr hook). Fix this regression by moving the configure_discard_support() call back to raid_ctr(). Fixes: ecbfb9f118 ("dm raid: add raid level takeover support") Signed-off-by: Heinz Mauelshagen Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit e362c317ba76c9d397238737deef952187a21395 Author: Bart Van Assche Date: Fri Nov 11 17:05:27 2016 -0800 dm rq: fix a race condition in rq_completed() commit d15bb3a6467e102e60d954aadda5fb19ce6fd8ec upstream. It is required to hold the queue lock when calling blk_run_queue_async() to avoid that a race between blk_run_queue_async() and blk_cleanup_queue() is triggered. Signed-off-by: Bart Van Assche Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 2c017f77e13d4325d8739fc9ed7ed2c3629da845 Author: Ondrej Kozina Date: Wed Nov 2 15:02:08 2016 +0100 dm crypt: mark key as invalid until properly loaded commit 265e9098bac02bc5e36cda21fdbad34cb5b2f48d upstream. In crypt_set_key(), if a failure occurs while replacing the old key (e.g. tfm->setkey() fails) the key must not have DM_CRYPT_KEY_VALID flag set. Otherwise, the crypto layer would have an invalid key that still has DM_CRYPT_KEY_VALID flag set. Signed-off-by: Ondrej Kozina Reviewed-by: Mikulas Patocka Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 3fae2a9e994b41603dbe1a5f32503cf558cf77f7 Author: Wei Yongjun Date: Mon Aug 8 14:09:27 2016 +0000 dm flakey: return -EINVAL on interval bounds error in flakey_ctr() commit bff7e067ee518f9ed7e1cbc63e4c9e01670d0b71 upstream. Fix to return error code -EINVAL instead of 0, as is done elsewhere in this function. Fixes: e80d1c805a3b ("dm: do not override error code returned from dm_get_device()") Signed-off-by: Wei Yongjun Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit e74fb822281ecd034a381b0c3dd1b9d63caf1f25 Author: Bart Van Assche Date: Wed Dec 7 16:56:06 2016 -0800 dm table: an 'all_blk_mq' table must be loaded for a blk-mq DM device commit 301fc3f5efb98633115bd887655b19f42c6dfaa8 upstream. When dm_table_set_type() is used by a target to establish a DM table's type (e.g. DM_TYPE_MQ_REQUEST_BASED in the case of DM multipath) the DM core must go on to verify that the devices in the table are compatible with the established type. Fixes: e83068a5 ("dm mpath: add optional "queue_mode" feature") Signed-off-by: Bart Van Assche Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 470b6910f7c1d492884fc4c741bd1011a8ac4205 Author: Mike Snitzer Date: Wed Nov 23 13:51:09 2016 -0500 dm table: fix 'all_blk_mq' inconsistency when an empty table is loaded commit 6936c12cf809850180b24947271b8f068fdb15e9 upstream. An earlier DM multipath table could have been build ontop of underlying devices that were all using blk-mq. In that case, if that active multipath table is replaced with an empty DM multipath table (that reflects all paths have failed) then it is important that the 'all_blk_mq' state of the active table is transfered to the new empty DM table. Otherwise dm-rq.c:dm_old_prep_tio() will incorrectly clone a request that isn't needed by the DM multipath target when it is to issue IO to an underlying blk-mq device. Fixes: e83068a5 ("dm mpath: add optional "queue_mode" feature") Reported-by: Bart Van Assche Tested-by: Bart Van Assche Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 67b0069a5175c14ddcf12c431bcf400ca23f9931 Author: Bart Van Assche Date: Fri Oct 28 17:18:48 2016 -0700 blk-mq: Do not invoke .queue_rq() for a stopped queue commit bc27c01b5c46d3bfec42c96537c7a3fae0bb2cc4 upstream. The meaning of the BLK_MQ_S_STOPPED flag is "do not call .queue_rq()". Hence modify blk_mq_make_request() such that requests are queued instead of issued if a queue has been stopped. Reported-by: Ming Lei Signed-off-by: Bart Van Assche Reviewed-by: Christoph Hellwig Reviewed-by: Ming Lei Reviewed-by: Hannes Reinecke Reviewed-by: Johannes Thumshirn Reviewed-by: Sagi Grimberg Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 7ac62bcde2d417777a40ced09d958f99ed9009c9 Author: Viresh Kumar Date: Thu Dec 1 16:28:16 2016 +0530 PM / OPP: Don't use OPP structure outside of rcu protected section commit dc39d06fcd7a4a82d72eae7b71e94e888b96d29e upstream. The OPP structure must not be used out of the rcu protected section. Cache the values to be used in separate variables instead. Signed-off-by: Viresh Kumar Reviewed-by: Stephen Boyd Tested-by: Dave Gerlach Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit c7a8a0ac8fee26d3c20402da306a17bcbbbb367b Author: Stephen Boyd Date: Wed Nov 30 16:21:25 2016 +0530 PM / OPP: Pass opp_table to dev_pm_opp_put_regulator() commit 91291d9ad92faa65a56a9a19d658d8049b78d3d4 upstream. Joonyoung Shim reported an interesting problem on his ARM octa-core Odoroid-XU3 platform. During system suspend, dev_pm_opp_put_regulator() was failing for a struct device for which dev_pm_opp_set_regulator() is called earlier. This happened because an earlier call to dev_pm_opp_of_cpumask_remove_table() function (from cpufreq-dt.c file) removed all the entries from opp_table->dev_list apart from the last CPU device in the cpumask of CPUs sharing the OPP. But both dev_pm_opp_set_regulator() and dev_pm_opp_put_regulator() routines get CPU device for the first CPU in the cpumask. And so the OPP core failed to find the OPP table for the struct device. This patch attempts to fix this problem by returning a pointer to the opp_table from dev_pm_opp_set_regulator() and using that as the parameter to dev_pm_opp_put_regulator(). This ensures that the dev_pm_opp_put_regulator() doesn't fail to find the opp table. Note that similar design problem also exists with other dev_pm_opp_put_*() APIs, but those aren't used currently by anyone and so we don't need to update them for now. Reported-by: Joonyoung Shim Signed-off-by: Stephen Boyd Signed-off-by: Viresh Kumar [ Viresh: Wrote commit log and tested on exynos 5250 ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit eab1c4e2d0ad4509ccb8476a604074547dc202e0 Author: Felipe Balbi Date: Wed Sep 28 12:33:31 2016 +0300 usb: gadget: composite: always set ep->mult to a sensible value commit eaa496ffaaf19591fe471a36cef366146eeb9153 upstream. ep->mult is supposed to be set to Isochronous and Interrupt Endapoint's multiplier value. This value is computed from different places depending on the link speed. If we're dealing with HighSpeed, then it's part of bits [12:11] of wMaxPacketSize. This case wasn't taken into consideration before. While at that, also make sure the ep->mult defaults to one so drivers can use it unconditionally and assume they'll never multiply ep->maxpacket to zero. Cc: Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit 44919a2ac4c6eca9f6e076e4485e06f251101a9a Author: Mel Gorman Date: Mon Dec 12 16:44:41 2016 -0800 mm, page_alloc: keep pcp count and list contents in sync if struct page is corrupted commit a6de734bc002fe2027ccc074fbbd87d72957b7a4 upstream. Vlastimil Babka pointed out that commit 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP") will allow the per-cpu list counter to be out of sync with the per-cpu list contents if a struct page is corrupted. The consequence is an infinite loop if the per-cpu lists get fully drained by free_pcppages_bulk because all the lists are empty but the count is positive. The infinite loop occurs here do { batch_free++; if (++migratetype == MIGRATE_PCPTYPES) migratetype = 0; list = &pcp->lists[migratetype]; } while (list_empty(list)); What the user sees is a bad page warning followed by a soft lockup with interrupts disabled in free_pcppages_bulk(). This patch keeps the accounting in sync. Fixes: 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP") Link: http://lkml.kernel.org/r/20161202112951.23346-2-mgorman@techsingularity.net Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka Acked-by: Michal Hocko Acked-by: Hillf Danton Cc: Christoph Lameter Cc: Johannes Weiner Cc: Jesper Dangaard Brouer Cc: Joonsoo Kim Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 04597beae7c23ccefd4502c0b9296d68292b807f Author: Shaohua Li Date: Mon Dec 12 16:41:50 2016 -0800 mm/vmscan.c: set correct defer count for shrinker commit 5f33a0803bbd781de916f5c7448cbbbbc763d911 upstream. Our system uses significantly more slab memory with memcg enabled with the latest kernel. With 3.10 kernel, slab uses 2G memory, while with 4.6 kernel, 6G memory is used. The shrinker has problem. Let's see we have two memcg for one shrinker. In do_shrink_slab: 1. Check cg1. nr_deferred = 0, assume total_scan = 700. batch size is 1024, then no memory is freed. nr_deferred = 700 2. Check cg2. nr_deferred = 700. Assume freeable = 20, then total_scan = 10 or 40. Let's assume it's 10. No memory is freed. nr_deferred = 10. The deferred share of cg1 is lost in this case. kswapd will free no memory even run above steps again and again. The fix makes sure one memcg's deferred share isn't lost. Link: http://lkml.kernel.org/r/2414be961b5d25892060315fbb56bb19d81d0c07.1476227351.git.shli@fb.com Signed-off-by: Shaohua Li Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit fe3d462821b02df53e5c99b2c9b059824adbac0a Author: Solganik Alexander Date: Sun Oct 30 10:35:15 2016 +0200 nvmet: Fix possible infinite loop triggered on hot namespace removal commit e4fcf07cca6a3b6c4be00df16f08be894325eaa3 upstream. When removing a namespace we delete it from the subsystem namespaces list with list_del_init which allows us to know if it is enabled or not. The problem is that list_del_init initialize the list next and does not respect the RCU list-traversal we do on the IO path for locating a namespace. Instead we need to use list_del_rcu which is allowed to run concurrently with the _rcu list-traversal primitives (keeps list next intact) and guarantees concurrent nvmet_find_naespace forward progress. By changing that, we cannot rely on ns->dev_link for knowing if the namspace is enabled, so add enabled indicator entry to nvmet_ns for that. Signed-off-by: Sagi Grimberg Signed-off-by: Solganik Alexander Signed-off-by: Greg Kroah-Hartman commit 890c39d35eb070eccad71564f55f4910668b25c4 Author: Omar Sandoval Date: Mon Nov 14 14:56:17 2016 -0800 loop: return proper error from loop_queue_rq() commit b4a567e8114327518c09f5632339a5954ab975a3 upstream. ->queue_rq() should return one of the BLK_MQ_RQ_QUEUE_* constants, not an errno. Fixes: f4aa4c7bbac6 ("block: loop: convert to per-device workqueue") Signed-off-by: Omar Sandoval Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 01e15b3328c4dc2073bc5080f1b31f21370be3f8 Author: Jaegeuk Kim Date: Thu Nov 24 12:45:15 2016 -0800 f2fs: fix to determine start_cp_addr by sbi->cur_cp_pack commit 8508e44ae98622f841f5ef29d0bf3d5db4e0c1cc upstream. We don't guarantee cp_addr is fixed by cp_version. This is to sync with f2fs-tools. Signed-off-by: Jaegeuk Kim Signed-off-by: Greg Kroah-Hartman commit 027611ef345d118310c1334b5d25b27ff95da684 Author: Jaegeuk Kim Date: Wed Nov 23 10:51:17 2016 -0800 f2fs: fix overflow due to condition check order commit e87f7329bbd6760c2acc4f1eb423362b08851a71 upstream. In the last ilen case, i was already increased, resulting in accessing out- of-boundary entry of do_replace and blkaddr. Fix to check ilen first to exit the loop. Fixes: 2aa8fbb9693020 ("f2fs: refactor __exchange_data_block for speed up") Signed-off-by: Jaegeuk Kim Signed-off-by: Greg Kroah-Hartman commit 1134ef11ffffcfb69834c86f632c0ce0c639524a Author: Nicolai Stange Date: Sun Nov 20 19:57:23 2016 +0100 f2fs: set ->owner for debugfs status file's file_operations commit 05e6ea2685c964db1e675a24a4f4e2adc22d2388 upstream. The struct file_operations instance serving the f2fs/status debugfs file lacks an initialization of its ->owner. This means that although that file might have been opened, the f2fs module can still get removed. Any further operation on that opened file, releasing included, will cause accesses to unmapped memory. Indeed, Mike Marshall reported the following: BUG: unable to handle kernel paging request at ffffffffa0307430 IP: [] full_proxy_release+0x24/0x90 <...> Call Trace: [] __fput+0xdf/0x1d0 [] ____fput+0xe/0x10 [] task_work_run+0x8e/0xc0 [] do_exit+0x2ae/0xae0 [] ? __audit_syscall_entry+0xae/0x100 [] ? syscall_trace_enter+0x1ca/0x310 [] do_group_exit+0x44/0xc0 [] SyS_exit_group+0x14/0x20 [] do_syscall_64+0x61/0x150 [] entry_SYSCALL64_slow_path+0x25/0x25 <...> ---[ end trace f22ae883fa3ea6b8 ]--- Fixing recursive fault but reboot is needed! Fix this by initializing the f2fs/status file_operations' ->owner with THIS_MODULE. This will allow debugfs to grab a reference to the f2fs module upon any open on that file, thus preventing it from getting removed. Fixes: 902829aa0b72 ("f2fs: move proc files to debugfs") Reported-by: Mike Marshall Reported-by: Martin Brandenburg Signed-off-by: Nicolai Stange Signed-off-by: Jaegeuk Kim Signed-off-by: Greg Kroah-Hartman commit a43e1c459a3dc6e2d44ae53c7f5b107006663cde Author: Jaegeuk Kim Date: Fri Dec 2 15:11:32 2016 -0800 Revert "f2fs: use percpu_counter for # of dirty pages in inode" commit 204706c7accfabb67b97eef9f9a28361b6201199 upstream. This reverts commit 1beba1b3a953107c3ff5448ab4e4297db4619c76. The perpcu_counter doesn't provide atomicity in single core and consume more DRAM. That incurs fs_mark test failure due to ENOMEM. Signed-off-by: Jaegeuk Kim Signed-off-by: Greg Kroah-Hartman commit 9abce3ca80a735218854cd5ad4b3cc567ed1b9c9 Author: Sergey Karamov Date: Sat Dec 10 17:54:58 2016 -0500 ext4: do not perform data journaling when data is encrypted commit 73b92a2a5e97d17cc4d5c4fe9d724d3273fb6fd2 upstream. Currently data journalling is incompatible with encryption: enabling both at the same time has never been supported by design, and would result in unpredictable behavior. However, users are not precluded from turning on both features simultaneously. This change programmatically replaces data journaling for encrypted regular files with ordered data journaling mode. Background: Journaling encrypted data has not been supported because it operates on buffer heads of the page in the page cache. Namely, when the commit happens, which could be up to five seconds after caching, the commit thread uses the buffer heads attached to the page to copy the contents of the page to the journal. With encryption, it would have been required to keep the bounce buffer with ciphertext for up to the aforementioned five seconds, since the page cache can only hold plaintext and could not be used for journaling. Alternatively, it would be required to setup the journal to initiate a callback at the commit time to perform deferred encryption - in this case, not only would the data have to be written twice, but it would also have to be encrypted twice. This level of complexity was not justified for a mode that in practice is very rarely used because of the overhead from the data journalling. Solution: If data=journaled has been set as a mount option for a filesystem, or if journaling is enabled on a regular file, do not perform journaling if the file is also encrypted, instead fall back to the data=ordered mode for the file. Rationale: The intent is to allow seamless and proper filesystem operation when journaling and encryption have both been enabled, and have these two conflicting features gracefully resolved by the filesystem. Fixes: 4461471107b7 Signed-off-by: Sergey Karamov Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit acf3efd6f0037582209cc589392a06ed26d9032b Author: Dan Carpenter Date: Sat Dec 10 09:56:01 2016 -0500 ext4: return -ENOMEM instead of success commit 578620f451f836389424833f1454eeeb2ffc9e9f upstream. We should set the error code if kzalloc() fails. Fixes: 67cf5b09a46f ("ext4: add the basic function for inline data support") Signed-off-by: Dan Carpenter Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 3e4f8da9d177b447fdd5e82c94f66bac38aa3045 Author: Darrick J. Wong Date: Sat Dec 10 09:55:01 2016 -0500 ext4: reject inodes with negative size commit 7e6e1ef48fc02f3ac5d0edecbb0c6087cd758d58 upstream. Don't load an inode with a negative size; this causes integer overflow problems in the VFS. [ Added EXT4_ERROR_INODE() to mark file system as corrupted. -TYT] Fixes: a48380f769df (ext4: rename i_dir_acl to i_size_high) Signed-off-by: Darrick J. Wong Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 8084f57bc46894dbe08dac2a09d66553d7513a34 Author: Theodore Ts'o Date: Fri Nov 18 13:37:47 2016 -0500 ext4: add sanity checking to count_overhead() commit c48ae41bafe31e9a66d8be2ced4e42a6b57fa814 upstream. The commit "ext4: sanity check the block and cluster size at mount time" should prevent any problems, but in case the superblock is modified while the file system is mounted, add an extra safety check to make sure we won't overrun the allocated buffer. Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 956e2a0e6779824877822864ad486f08184f1a3e Author: Theodore Ts'o Date: Fri Nov 18 13:24:26 2016 -0500 ext4: fix in-superblock mount options processing commit 5aee0f8a3f42c94c5012f1673420aee96315925a upstream. Fix a large number of problems with how we handle mount options in the superblock. For one, if the string in the superblock is long enough that it is not null terminated, we could run off the end of the string and try to interpret superblocks fields as characters. It's unlikely this will cause a security problem, but it could result in an invalid parse. Also, parse_options is destructive to the string, so in some cases if there is a comma-separated string, it would be modified in the superblock. (Fortunately it only happens on file systems with a 1k block size.) Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 01772f4683a9f92a06b3d2467c0084df2a512f35 Author: Theodore Ts'o Date: Fri Nov 18 13:28:30 2016 -0500 ext4: use more strict checks for inodes_per_block on mount commit cd6bb35bf7f6d7d922509bf50265383a0ceabe96 upstream. Centralize the checks for inodes_per_block and be more strict to make sure the inodes_per_block_group can't end up being zero. Signed-off-by: Theodore Ts'o Reviewed-by: Andreas Dilger Signed-off-by: Greg Kroah-Hartman commit b493c715cdce57c803413e635e19f34f96041430 Author: Chandan Rajendra Date: Mon Nov 14 21:26:26 2016 -0500 ext4: fix stack memory corruption with 64k block size commit 30a9d7afe70ed6bd9191d3000e2ef1a34fb58493 upstream. The number of 'counters' elements needed in 'struct sg' is super_block->s_blocksize_bits + 2. Presently we have 16 'counters' elements in the array. This is insufficient for block sizes >= 32k. In such cases the memcpy operation performed in ext4_mb_seq_groups_show() would cause stack memory corruption. Fixes: c9de560ded61f Signed-off-by: Chandan Rajendra Signed-off-by: Theodore Ts'o Reviewed-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit c3881abae6e7060d17beaa4f2af3190ff0937fd2 Author: Chandan Rajendra Date: Mon Nov 14 21:04:37 2016 -0500 ext4: fix mballoc breakage with 64k block size commit 69e43e8cc971a79dd1ee5d4343d8e63f82725123 upstream. 'border' variable is set to a value of 2 times the block size of the underlying filesystem. With 64k block size, the resulting value won't fit into a 16-bit variable. Hence this commit changes the data type of 'border' to 'unsigned int'. Fixes: c9de560ded61f Signed-off-by: Chandan Rajendra Signed-off-by: Theodore Ts'o Reviewed-by: Andreas Dilger Signed-off-by: Greg Kroah-Hartman commit 24d1251a5d83ee121d978f3e7cb5c19ffb4272b9 Author: Theodore Ts'o Date: Sun Nov 13 22:02:29 2016 -0500 ext4: don't lock buffer in ext4_commit_super if holding spinlock commit 1566a48aaa10c6bb29b9a69dd8279f9a4fc41e35 upstream. If there is an error reported in mballoc via ext4_grp_locked_error(), the code is holding a spinlock, so ext4_commit_super() must not try to lock the buffer head, or else it will trigger a BUG: BUG: sleeping function called from invalid context at ./include/linux/buffer_head.h:358 in_atomic(): 1, irqs_disabled(): 0, pid: 993, name: mount CPU: 0 PID: 993 Comm: mount Not tainted 4.9.0-rc1-clouder1 #62 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 ffff880006423548 ffffffff81318c89 ffffffff819ecdd0 0000000000000166 ffff880006423558 ffffffff810810b0 ffff880006423580 ffffffff81081153 ffff880006e5a1a0 ffff88000690e400 0000000000000000 ffff8800064235c0 Call Trace: [] dump_stack+0x67/0x9e [] ___might_sleep+0xf0/0x140 [] __might_sleep+0x53/0xb0 [] ext4_commit_super+0x19c/0x290 [] __ext4_grp_locked_error+0x14a/0x230 [] ? __might_sleep+0x53/0xb0 [] ext4_mb_generate_buddy+0x1de/0x320 Since ext4_grp_locked_error() calls ext4_commit_super with sync == 0 (and it is the only caller which does so), avoid locking and unlocking the buffer in this case. This can result in races with ext4_commit_super() if there are other problems (which is what commit 4743f83990614 was trying to address), but a Warning is better than BUG. Fixes: 4743f83990614 Reported-by: Nikolay Borisov Signed-off-by: Theodore Ts'o Reviewed-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit 21cc91554c3da12708c9a85a98ce144409dc21ef Author: Alex Porosanu Date: Wed Nov 9 10:46:11 2016 +0200 crypto: caam - fix AEAD givenc descriptors commit d128af17876d79b87edf048303f98b35f6a53dbc upstream. The AEAD givenc descriptor relies on moving the IV through the output FIFO and then back to the CTX2 for authentication. The SEQ FIFO STORE could be scheduled before the data can be read from OFIFO, especially since the SEQ FIFO LOAD needs to wait for the SEQ FIFO LOAD SKIP to finish first. The SKIP takes more time when the input is SG than when it's a contiguous buffer. If the SEQ FIFO LOAD is not scheduled before the STORE, the DECO will hang waiting for data to be available in the OFIFO so it can be transferred to C2. In order to overcome this, first force transfer of IV to C2 by starting the "cryptlen" transfer first and then starting to store data from OFIFO to the output buffer. Fixes: 1acebad3d8db8 ("crypto: caam - faster aead implementation") Signed-off-by: Alex Porosanu Signed-off-by: Horia Geantă Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit e71b4e061c9677cef1f1f38fd7236e198fab1287 Author: Eric W. Biederman Date: Tue Nov 22 12:06:50 2016 -0600 ptrace: Don't allow accessing an undumpable mm commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3 upstream. It is the reasonable expectation that if an executable file is not readable there will be no way for a user without special privileges to read the file. This is enforced in ptrace_attach but if ptrace is already attached before exec there is no enforcement for read-only executables. As the only way to read such an mm is through access_process_vm spin a variant called ptrace_access_vm that will fail if the target process is not being ptraced by the current process, or the current process did not have sufficient privileges when ptracing began to read the target processes mm. In the ptrace implementations replace access_process_vm by ptrace_access_vm. There remain several ptrace sites that still use access_process_vm as they are reading the target executables instructions (for kernel consumption) or register stacks. As such it does not appear necessary to add a permission check to those calls. This bug has always existed in Linux. Fixes: v1.0 Reported-by: Andy Lutomirski Signed-off-by: "Eric W. Biederman" Signed-off-by: Greg Kroah-Hartman commit e747b4ae3b6bca205d82e86366e140cdcbfb7731 Author: Eric W. Biederman Date: Mon Nov 14 18:48:07 2016 -0600 ptrace: Capture the ptracer's creds not PT_PTRACE_CAP commit 64b875f7ac8a5d60a4e191479299e931ee949b67 upstream. When the flag PT_PTRACE_CAP was added the PTRACE_TRACEME path was overlooked. This can result in incorrect behavior when an application like strace traces an exec of a setuid executable. Further PT_PTRACE_CAP does not have enough information for making good security decisions as it does not report which user namespace the capability is in. This has already allowed one mistake through insufficient granulariy. I found this issue when I was testing another corner case of exec and discovered that I could not get strace to set PT_PTRACE_CAP even when running strace as root with a full set of caps. This change fixes the above issue with strace allowing stracing as root a setuid executable without disabling setuid. More fundamentaly this change allows what is allowable at all times, by using the correct information in it's decision. Fixes: 4214e42f96d4 ("v2.4.9.11 -> v2.4.9.12") Signed-off-by: "Eric W. Biederman" Signed-off-by: Greg Kroah-Hartman commit 48466c4772d2cfeb331395043f7edd8c5324ef1b Author: Linus Torvalds Date: Wed Dec 14 12:45:25 2016 -0800 vfs,mm: fix return value of read() at s_maxbytes commit d05c5f7ba164aed3db02fb188c26d0dd94f5455b upstream. We truncated the possible read iterator to s_maxbytes in commit c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()"), but our end condition handling was wrong: it's not an error to try to read at the end of the file. Reading past the end should return EOF (0), not EINVAL. See for example https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1649342 http://lists.gnu.org/archive/html/bug-coreutils/2016-12/msg00008.html where a md5sum of a maximally sized file fails because the final read is exactly at s_maxbytes. Fixes: c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()") Reported-by: Joseph Salisbury Cc: Wei Fang Cc: Christoph Hellwig Cc: Dave Chinner Cc: Al Viro Cc: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 694a95fa6dae4991f16cda333d897ea063021fed Author: Eric W. Biederman Date: Thu Oct 13 21:23:16 2016 -0500 mm: Add a user_ns owner to mm_struct and fix ptrace permission checks commit bfedb589252c01fa505ac9f6f2a3d5d68d707ef4 upstream. During exec dumpable is cleared if the file that is being executed is not readable by the user executing the file. A bug in ptrace_may_access allows reading the file if the executable happens to enter into a subordinate user namespace (aka clone(CLONE_NEWUSER), unshare(CLONE_NEWUSER), or setns(fd, CLONE_NEWUSER). This problem is fixed with only necessary userspace breakage by adding a user namespace owner to mm_struct, captured at the time of exec, so it is clear in which user namespace CAP_SYS_PTRACE must be present in to be able to safely give read permission to the executable. The function ptrace_may_access is modified to verify that the ptracer has CAP_SYS_ADMIN in task->mm->user_ns instead of task->cred->user_ns. This ensures that if the task changes it's cred into a subordinate user namespace it does not become ptraceable. The function ptrace_attach is modified to only set PT_PTRACE_CAP when CAP_SYS_PTRACE is held over task->mm->user_ns. The intent of PT_PTRACE_CAP is to be a flag to note that whatever permission changes the task might go through the tracer has sufficient permissions for it not to be an issue. task->cred->user_ns is always the same as or descendent of mm->user_ns. Which guarantees that having CAP_SYS_PTRACE over mm->user_ns is the worst case for the tasks credentials. To prevent regressions mm->dumpable and mm->user_ns are not considered when a task has no mm. As simply failing ptrace_may_attach causes regressions in privileged applications attempting to read things such as /proc//stat Acked-by: Kees Cook Tested-by: Cyrill Gorcunov Fixes: 8409cca70561 ("userns: allow ptrace from non-init user namespaces") Signed-off-by: "Eric W. Biederman" Signed-off-by: Greg Kroah-Hartman commit cfa2d65b2622d14b2f1368fbc9a6b4ab85141644 Author: NeilBrown Date: Mon Dec 12 08:21:51 2016 -0700 block_dev: don't test bdev->bd_contains when it is not stable commit bcc7f5b4bee8e327689a4d994022765855c807ff upstream. bdev->bd_contains is not stable before calling __blkdev_get(). When __blkdev_get() is called on a parition with ->bd_openers == 0 it sets bdev->bd_contains = bdev; which is not correct for a partition. After a call to __blkdev_get() succeeds, ->bd_openers will be > 0 and then ->bd_contains is stable. When FMODE_EXCL is used, blkdev_get() calls bd_start_claiming() -> bd_prepare_to_claim() -> bd_may_claim() This call happens before __blkdev_get() is called, so ->bd_contains is not stable. So bd_may_claim() cannot safely use ->bd_contains. It currently tries to use it, and this can lead to a BUG_ON(). This happens when a whole device is already open with a bd_holder (in use by dm in my particular example) and two threads race to open a partition of that device for the first time, one opening with O_EXCL and one without. The thread that doesn't use O_EXCL gets through blkdev_get() to __blkdev_get(), gains the ->bd_mutex, and sets bdev->bd_contains = bdev; Immediately thereafter the other thread, using FMODE_EXCL, calls bd_start_claiming() from blkdev_get(). This should fail because the whole device has a holder, but because bdev->bd_contains == bdev bd_may_claim() incorrectly reports success. This thread continues and blocks on bd_mutex. The first thread then sets bdev->bd_contains correctly and drops the mutex. The thread using FMODE_EXCL then continues and when it calls bd_may_claim() again in: BUG_ON(!bd_may_claim(bdev, whole, holder)); The BUG_ON fires. Fix this by removing the dependency on ->bd_contains in bd_may_claim(). As bd_may_claim() has direct access to the whole device, it can simply test if the target bdev is the whole device. Fixes: 6b4517a7913a ("block: implement bd_claiming and claiming block") Signed-off-by: NeilBrown Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit b6cce9b8e8139e8b9130dd2efe950bac0d4fea50 Author: Linus Torvalds Date: Wed Dec 21 10:59:34 2016 -0800 splice: reinstate SIGPIPE/EPIPE handling commit 52bce91165e5f2db422b2b972e83d389e5e4725c upstream. Commit 8924feff66f3 ("splice: lift pipe_lock out of splice_to_pipe()") caused a regression when there were no more readers left on a pipe that was being spliced into: rather than the expected SIGPIPE and -EPIPE return value, the writer would end up waiting forever for space to free up (which obviously was not going to happen with no readers around). Fixes: 8924feff66f3 ("splice: lift pipe_lock out of splice_to_pipe()") Reported-and-tested-by: Andreas Schwab Debugged-by: Al Viro Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit c1df5a63716b0f73e95b1b26d878691e6175ea34 Author: Aleksa Sarai Date: Wed Dec 21 16:26:24 2016 +1100 fs: exec: apply CLOEXEC before changing dumpable task flags commit 613cc2b6f272c1a8ad33aefa21cad77af23139f7 upstream. If you have a process that has set itself to be non-dumpable, and it then undergoes exec(2), any CLOEXEC file descriptors it has open are "exposed" during a race window between the dumpable flags of the process being reset for exec(2) and CLOEXEC being applied to the file descriptors. This can be exploited by a process by attempting to access /proc//fd/... during this window, without requiring CAP_SYS_PTRACE. The race in question is after set_dumpable has been (for get_link, though the trace is basically the same for readlink): [vfs] -> proc_pid_link_inode_operations.get_link -> proc_pid_get_link -> proc_fd_access_allowed -> ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS); Which will return 0, during the race window and CLOEXEC file descriptors will still be open during this window because do_close_on_exec has not been called yet. As a result, the ordering of these calls should be reversed to avoid this race window. This is of particular concern to container runtimes, where joining a PID namespace with file descriptors referring to the host filesystem can result in security issues (since PRCTL_SET_DUMPABLE doesn't protect against access of CLOEXEC file descriptors -- file descriptors which may reference filesystem objects the container shouldn't have access to). Cc: dev@opencontainers.org Reported-by: Michael Crosby Signed-off-by: Aleksa Sarai Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit 21245b8635e8636e9e01df06f33cf08f369322b7 Author: Eric W. Biederman Date: Wed Nov 16 22:06:51 2016 -0600 exec: Ensure mm->user_ns contains the execed files commit f84df2a6f268de584a201e8911384a2d244876e3 upstream. When the user namespace support was merged the need to prevent ptrace from revealing the contents of an unreadable executable was overlooked. Correct this oversight by ensuring that the executed file or files are in mm->user_ns, by adjusting mm->user_ns. Use the new function privileged_wrt_inode_uidgid to see if the executable is a member of the user namespace, and as such if having CAP_SYS_PTRACE in the user namespace should allow tracing the executable. If not update mm->user_ns to the parent user namespace until an appropriate parent is found. Reported-by: Jann Horn Fixes: 9e4a36ece652 ("userns: Fail exec for suid and sgid binaries with ids outside our user namespace.") Signed-off-by: "Eric W. Biederman" Signed-off-by: Greg Kroah-Hartman commit 0de98eef9c115ce0a23995d06bba32691297fbf8 Author: Richard Watts Date: Fri Dec 2 23:14:38 2016 +0200 clk: ti: omap36xx: Work around sprz319 advisory 2.1 commit 035cd485a47dda64f25ccf8a90b11a07d0b7aa7a upstream. The OMAP36xx DPLL5, driving EHCI USB, can be subject to a long-term frequency drift. The frequency drift magnitude depends on the VCO update rate, which is inversely proportional to the PLL divider. The kernel DPLL configuration code results in a high value for the divider, leading to a long term drift high enough to cause USB transmission errors. In the worst case the USB PHY's ULPI interface can stop responding, breaking USB operation completely. This manifests itself on the Beagleboard xM by the LAN9514 reporting 'Cannot enable port 2. Maybe the cable is bad?' in the kernel log. Errata sprz319 advisory 2.1 documents PLL values that minimize the drift. Use them automatically when DPLL5 is used for USB operation, which we detect based on the requested clock rate. The clock framework will still compute the PLL parameters and resulting rate as usual, but the PLL M and N values will then be overridden. This can result in the effective clock rate being slightly different than the rate cached by the clock framework, but won't cause any adverse effect to USB operation. Signed-off-by: Richard Watts [Upported from v3.2 to v4.9] Signed-off-by: Laurent Pinchart Tested-by: Ladislav Michl Signed-off-by: Stephen Boyd Cc: Adam Ford Signed-off-by: Greg Kroah-Hartman commit 0ce4f00087b476db4baf97344fffee4d84a6e9d3 Author: Kai-Heng Feng Date: Tue Dec 6 16:56:27 2016 +0800 ALSA: hda: when comparing pin configurations, ignore assoc in addition to seq commit 5e0ad0d8747f3e4803a9c3d96d64dd7332506d3c upstream. Commit [64047d7f4912 ALSA: hda - ignore the assoc and seq when comparing pin configurations] intented to ignore both seq and assoc at pin comparing, but it only ignored seq. So that commit may still fail to match pins on some machines. Change the bitmask to also ignore assoc. v2: Use macro to do bit masking. Thanks to Hui Wang for the analysis. Fixes: 64047d7f4912 ("ALSA: hda - ignore the assoc and seq when comparing...") Signed-off-by: Kai-Heng Feng Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit e029ef3a9c82ae05792173235f88d5e46db0f79c Author: Takashi Iwai Date: Tue Dec 6 11:55:17 2016 +0100 ALSA: hda - Gate the mic jack on HP Z1 Gen3 AiO commit f73cd43ac3b41c0f09a126387f302bbc0d9c726d upstream. HP Z1 Gen3 AiO with Conexant codec doesn't give an unsolicited event to the headset mic pin upon the jack plugging, it reports only to the headphone pin. It results in the missing mic switching. Let's fix up by simply gating the jack event. Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 0119d5d44034399478ae80349e810acb15aa5e1d Author: Hui Wang Date: Wed Nov 23 16:05:38 2016 +0800 ALSA: hda - fix headset-mic problem on a Dell laptop commit 989dbe4a30728c047316ab87e5fa8b609951ce7c upstream. This group of new pins is not in the pin quirk table yet, adding them to the pin quirk table to fix the headset-mic problem. Signed-off-by: Hui Wang Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 37b7c5db5a3065ce5d72ce7067d420273819f9a9 Author: Hui Wang Date: Wed Nov 23 16:05:37 2016 +0800 ALSA: hda - ignore the assoc and seq when comparing pin configurations commit 64047d7f4912de1769d1bf0d34c6322494b13779 upstream. More and more pin configurations have been adding to the pin quirk table, lots of them are only different from assoc and seq, but they all apply to the same QUIRK_FIXUP, if we don't compare assoc and seq when matching pin configurations, it will greatly reduce the pin quirk table size. We have tested this change on a couple of Dell laptops, it worked well. Signed-off-by: Hui Wang Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 0f1047be4a9f8411c8ed9ba1a4bb63bfa6659e1c Author: Sven Hahne Date: Fri Nov 25 14:16:43 2016 +0100 ALSA: hda/ca0132 - Add quirk for Alienware 15 R2 2016 commit b5337cfe067e96b8a98699da90c7dcd2bec21133 upstream. I'm using an Alienware 15 R2 and had to use the alienware quirks to get my headphone output working. I fixed it by adding, SND_PCI_QUIRK(0x1028, 0x0708, "Alienware 15 R2 2016", QUIRK_ALIENWARE) to the patch. Signed-off-by: Sven Hahne Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit fa2e770f88bcd50a34182792d06a1ff488cbac5c Author: Jussi Laako Date: Mon Nov 28 11:27:45 2016 +0200 ALSA: hiface: Fix M2Tech hiFace driver sampling rate change commit 995c6a7fd9b9212abdf01160f6ce3193176be503 upstream. Sampling rate changes after first set one are not reflected to the hardware, while driver and ALSA think the rate has been changed. Fix the problem by properly stopping the interface at the beginning of prepare call, allowing new rate to be set to the hardware. This keeps the hardware in sync with the driver. Signed-off-by: Jussi Laako Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 205d3de9637f253dfaeca8a7740af1e9d554c3c7 Author: Con Kolivas Date: Fri Dec 9 15:15:57 2016 +1100 ALSA: usb-audio: Add QuickCam Communicate Deluxe/S7500 to volume_control_quirks commit 82ffb6fc637150b279f49e174166d2aa3853eaf4 upstream. The Logitech QuickCam Communicate Deluxe/S7500 microphone fails with the following warning. [ 6.778995] usb 2-1.2.2.2: Warning! Unlikely big volume range (=3072), cval->res is probably wrong. [ 6.778996] usb 2-1.2.2.2: [5] FU [Mic Capture Volume] ch = 1, val = 4608/7680/1 Adding it to the list of devices in volume_control_quirks makes it work properly, fixing related typo. Signed-off-by: Con Kolivas Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 77bd73ce21fafe2ca80fe09b6e9dbebf93fd4cd6 Author: Krzysztof Opasiak Date: Thu Dec 1 19:14:37 2016 +0100 usbip: vudc: fix: Clear already_seen flag also for ep0 commit 3e448e13a662fb20145916636127995cbf37eb83 upstream. ep_list inside gadget structure doesn't contain ep0. It is stored separately in ep0 field. This causes an urb hang if gadget driver decides to delay setup handling. On host side this is visible as timeout error when setting configuration. This bug can be reproduced using for example any gadget with mass storage function. Fixes: abdb29574322 ("usbip: vudc: Add vudc_transfer") Signed-off-by: Krzysztof Opasiak Acked-by: Shuah Khan Signed-off-by: Greg Kroah-Hartman commit 420f170ce1ba999872f4be893807041b0d8db930 Author: Alan Stern Date: Fri Oct 21 16:49:07 2016 -0400 USB: UHCI: report non-PME wakeup signalling for Intel hardware commit ccdb6be9ec6580ef69f68949ebe26e0fb58a6fb0 upstream. The UHCI controllers in Intel chipsets rely on a platform-specific non-PME mechanism for wakeup signalling. They can generate wakeup signals even though they don't support PME. We need to let the USB core know this so that it will enable runtime suspend for UHCI controllers. Signed-off-by: Alan Stern Signed-off-by: Bjorn Helgaas Acked-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit e0aa5ec40d6effc891eb63740bb03f5f50cb539e Author: Felipe Balbi Date: Wed Sep 28 10:38:11 2016 +0300 usb: gadget: composite: correctly initialize ep->maxpacket commit e8f29bb719b47a234f33b0af62974d7a9521a52c upstream. usb_endpoint_maxp() returns wMaxPacketSize in its raw form. Without taking into consideration that it also contains other bits reserved for isochronous endpoints. This patch fixes one occasion where this is a problem by making sure that we initialize ep->maxpacket only with lower 10 bits of the value returned by usb_endpoint_maxp(). Note that seperate patches will be necessary to audit all call sites of usb_endpoint_maxp() and make sure that usb_endpoint_maxp() only returns lower 10 bits of wMaxPacketSize. Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit 5180169dae85a3bd5b081529ba149374a0b3ffed Author: Peter Chen Date: Tue Nov 8 10:10:44 2016 +0800 usb: gadget: f_uac2: fix error handling at afunc_bind commit f1d3861d63a5d79b8968a02eea1dcb01bb684e62 upstream. The current error handling flow uses incorrect goto label, fix it Fixes: d12a8727171c ("usb: gadget: function: Remove redundant usb_free_all_descriptors") Signed-off-by: Peter Chen Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit eab169397ad687e69b3bde3547a1fd37e7f81194 Author: Rafał Miłecki Date: Tue Dec 6 00:39:33 2016 +0100 usb: core: usbport: Use proper LED API to fix potential crash commit 89778ba335e302a450932ce5b703c1ee6216e949 upstream. Calling brightness_set manually isn't safe as some LED drivers don't implement this callback. The best idea is to just use a proper helper which will fallback to the brightness_set_blocking callback if needed. This fixes: [ 1461.761528] Unable to handle kernel NULL pointer dereference at virtual address 00000000 (...) [ 1462.117049] Backtrace: [ 1462.119521] [] (usbport_trig_port_store [ledtrig_usbport]) from [] (dev_attr_store+0x20/0x2c) [ 1462.129826] r7:dcabc7c0 r6:dee0ff80 r5:00000002 r4:bf228164 [ 1462.135511] [] (dev_attr_store) from [] (sysfs_kf_write+0x48/0x4c) [ 1462.143459] r5:00000002 r4:c023f738 [ 1462.147049] [] (sysfs_kf_write) from [] (kernfs_fop_write+0xf8/0x1f8) [ 1462.155258] r5:00000002 r4:df4a1000 [ 1462.158850] [] (kernfs_fop_write) from [] (__vfs_write+0x34/0x120) [ 1462.166800] r10:00000000 r9:dee0e000 r8:c000fc24 r7:00000002 r6:dee0ff80 r5:c01689c0 [ 1462.174660] r4:df727a80 [ 1462.177204] [] (__vfs_write) from [] (vfs_write+0xac/0x170) [ 1462.184543] r9:dee0e000 r8:c000fc24 r7:dee0ff80 r6:b6f092d0 r5:df727a80 r4:00000002 [ 1462.192319] [] (vfs_write) from [] (SyS_write+0x4c/0xa8) [ 1462.199396] r9:dee0e000 r8:c000fc24 r7:00000002 r6:b6f092d0 r5:df727a80 r4:df727a80 [ 1462.207174] [] (SyS_write) from [] (ret_fast_syscall+0x0/0x3c) [ 1462.214774] r7:00000004 r6:ffffffff r5:00000000 r4:00000000 [ 1462.220456] Code: bad PC value [ 1462.223560] ---[ end trace 676638a3a12c7a56 ]--- Reported-by: Ralph Sennhauser Signed-off-by: Rafał Miłecki Fixes: 0f247626cbb ("usb: core: Introduce a USB port LED trigger") Signed-off-by: Greg Kroah-Hartman commit 32a35351b7ec93947efb394f252d9ad714a04f18 Author: Mathias Nyman Date: Thu Nov 17 11:14:14 2016 +0200 usb: hub: Fix auto-remount of safely removed or ejected USB-3 devices commit 37be66767e3cae4fd16e064d8bb7f9f72bf5c045 upstream. USB-3 does not have any link state that will avoid negotiating a connection with a plugged-in cable but will signal the host when the cable is unplugged. For USB-3 we used to first set the link to Disabled, then to RxDdetect to be able to detect cable connects or disconnects. But in RxDetect the connected device is detected again and eventually enabled. Instead set the link into U3 and disable remote wakeups for the device. This is what Windows does, and what Alan Stern suggested. Cc: Alan Stern Acked-by: Alan Stern Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman commit 3666b6280351c88592f7d767052d948fc81aeb18 Author: Felipe Balbi Date: Thu Sep 22 11:01:01 2016 +0300 usb: dwc3: gadget: set PCM1 field of isochronous-first TRBs commit 6b9018d4c1e5c958625be94a160a5984351d4632 upstream. In case of High-Speed, High-Bandwidth endpoints, we need to tell DWC3 that we have more than one packet per interval. We do that by setting PCM1 field of Isochronous-First TRB. Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit 20d7c1a68b5b9fc7fe94c543a9920d9d7234017c Author: Nathaniel Quillin Date: Mon Dec 5 06:53:00 2016 -0800 USB: cdc-acm: add device id for GW Instek AFG-125 commit 301216044e4c27d5a7323c1fa766266fad00db5e upstream. Add device-id entry for GW Instek AFG-125, which has a byte swapped bInterfaceSubClass (0x20). Signed-off-by: Nathaniel Quillin Acked-by: Oliver Neukum Signed-off-by: Greg Kroah-Hartman commit c094cd32b0c714cc94608ca8c706261d50bc43e5 Author: Johan Hovold Date: Tue Nov 29 16:55:01 2016 +0100 USB: serial: kl5kusb105: fix open error path commit 6774d5f53271d5f60464f824748995b71da401ab upstream. Kill urbs and disable read before returning from open on failure to retrieve the line state. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 5e7c90bd53c61f878eb18b02f388d54ba47eca83 Author: Giuseppe Lippolis Date: Tue Dec 6 21:24:19 2016 +0100 USB: serial: option: add dlink dwm-158 commit d8a12b7117b42fd708f1e908498350232bdbd5ff upstream. Adding registration for 3G modem DWM-158 in usb-serial-option Signed-off-by: Giuseppe Lippolis Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 142513d6dc7ca378d90537abcc341d65dfaf438e Author: Daniele Palmas Date: Thu Dec 1 16:38:39 2016 +0100 USB: serial: option: add support for Telit LE922A PIDs 0x1040, 0x1041 commit 5b09eff0c379002527ad72ea5ea38f25da8a8650 upstream. This patch adds support for PIDs 0x1040, 0x1041 of Telit LE922A. Since the interface positions are the same than the ones used for other Telit compositions, previous defined blacklists are used. Signed-off-by: Daniele Palmas Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 1a5ec7dd17a98547618ea7b0ef3bbeb3837a7c49 Author: Filipe Manana Date: Thu Nov 24 02:09:04 2016 +0000 Btrfs: fix qgroup rescan worker initialization commit 8d9eddad19467b008e0c881bc3133d7da94b7ec1 upstream. We were setting the qgroup_rescan_running flag to true only after the rescan worker started (which is a task run by a queue). So if a user space task starts a rescan and immediately after asks to wait for the rescan worker to finish, this second call might happen before the rescan worker task starts running, in which case the rescan wait ioctl returns immediatley, not waiting for the rescan worker to finish. This was making the fstest btrfs/022 fail very often. Fixes: d2c609b834d6 (btrfs: properly track when rescan worker is running) Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit a1e0e0476afb2471f9cee0a995206b053a8241f8 Author: Filipe Manana Date: Wed Nov 23 16:21:18 2016 +0000 Btrfs: fix emptiness check for dirtied extent buffers at check_leaf() commit f177d73949bf758542ca15a1c1945bd2e802cc65 upstream. We can not simply use the owner field from an extent buffer's header to get the id of the respective tree when the extent buffer is from a relocation tree. When we create the root for a relocation tree we leave (on purpose) the owner field with the same value as the subvolume's tree root (we do this at ctree.c:btrfs_copy_root()). So we must ignore extent buffers from relocation trees, which have the BTRFS_HEADER_FLAG_RELOC flag set, because otherwise we will always consider the extent buffer as not being the root of the tree (the root of original subvolume tree is always different from the root of the respective relocation tree). This lead to assertion failures when running with the integrity checker enabled (CONFIG_BTRFS_FS_CHECK_INTEGRITY=y) such as the following: [ 643.393409] BTRFS critical (device sdg): corrupt leaf, non-root leaf's nritems is 0: block=38506496, root=260, slot=0 [ 643.397609] BTRFS info (device sdg): leaf 38506496 total ptrs 0 free space 3995 [ 643.407075] assertion failed: 0, file: fs/btrfs/disk-io.c, line: 4078 [ 643.408425] ------------[ cut here ]------------ [ 643.409112] kernel BUG at fs/btrfs/ctree.h:3419! [ 643.409773] invalid opcode: 0000 [#1] PREEMPT SMP [ 643.410447] Modules linked in: dm_flakey dm_mod crc32c_generic btrfs xor raid6_pq ppdev psmouse acpi_cpufreq parport_pc evdev parport tpm_tis tpm_tis_core pcspkr serio_raw i2c_piix4 sg tpm i2c_core button processor loop autofs4 ext4 crc16 jbd2 mbcache sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring scsi_mod virtio e1000 floppy [ 643.414356] CPU: 11 PID: 32726 Comm: btrfs Not tainted 4.8.0-rc8-btrfs-next-35+ #1 [ 643.414356] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 [ 643.414356] task: ffff880145e95b00 task.stack: ffff88014826c000 [ 643.414356] RIP: 0010:[] [] assfail.constprop.41+0x1c/0x1e [btrfs] [ 643.414356] RSP: 0018:ffff88014826fa28 EFLAGS: 00010292 [ 643.414356] RAX: 0000000000000039 RBX: ffff88014e2d7c38 RCX: 0000000000000001 [ 643.414356] RDX: ffff88023f4d2f58 RSI: ffffffff81806c63 RDI: 00000000ffffffff [ 643.414356] RBP: ffff88014826fa28 R08: 0000000000000001 R09: 0000000000000000 [ 643.414356] R10: ffff88014826f918 R11: ffffffff82f3c5ed R12: ffff880172910000 [ 643.414356] R13: ffff880233992230 R14: ffff8801a68a3310 R15: fffffffffffffff8 [ 643.414356] FS: 00007f9ca305e8c0(0000) GS:ffff88023f4c0000(0000) knlGS:0000000000000000 [ 643.414356] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 643.414356] CR2: 00007f9ca3071000 CR3: 000000015d01b000 CR4: 00000000000006e0 [ 643.414356] Stack: [ 643.414356] ffff88014826fa50 ffffffffa02d655a 000000000000000a ffff88014e2d7c38 [ 643.414356] 0000000000000000 ffff88014826faa8 ffffffffa02b72f3 ffff88014826fab8 [ 643.414356] 00ffffffa03228e4 0000000000000000 0000000000000000 ffff8801bbd4e000 [ 643.414356] Call Trace: [ 643.414356] [] btrfs_mark_buffer_dirty+0xdf/0xe5 [btrfs] [ 643.414356] [] btrfs_copy_root+0x18a/0x1d1 [btrfs] [ 643.414356] [] create_reloc_root+0x72/0x1ba [btrfs] [ 643.414356] [] btrfs_init_reloc_root+0x7b/0xa7 [btrfs] [ 643.414356] [] record_root_in_trans+0xdf/0xed [btrfs] [ 643.414356] [] btrfs_record_root_in_trans+0x50/0x6a [btrfs] [ 643.414356] [] create_subvol+0x472/0x773 [btrfs] [ 643.414356] [] btrfs_mksubvol+0x3da/0x463 [btrfs] [ 643.414356] [] ? btrfs_mksubvol+0x3da/0x463 [btrfs] [ 643.414356] [] ? preempt_count_add+0x65/0x68 [ 643.414356] [] ? __mnt_want_write+0x62/0x77 [ 643.414356] [] btrfs_ioctl_snap_create_transid+0xce/0x187 [btrfs] [ 643.414356] [] btrfs_ioctl_snap_create+0x67/0x81 [btrfs] [ 643.414356] [] btrfs_ioctl+0x508/0x20dd [btrfs] [ 643.414356] [] ? __this_cpu_preempt_check+0x13/0x15 [ 643.414356] [] ? handle_mm_fault+0x976/0x9ab [ 643.414356] [] ? arch_local_irq_save+0x9/0xc [ 643.414356] [] vfs_ioctl+0x18/0x34 [ 643.414356] [] do_vfs_ioctl+0x581/0x600 [ 643.414356] [] ? entry_SYSCALL_64_fastpath+0x5/0xa8 [ 643.414356] [] ? trace_hardirqs_on_caller+0x17b/0x197 [ 643.414356] [] SyS_ioctl+0x57/0x79 [ 643.414356] [] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 643.414356] [] ? trace_hardirqs_off_caller+0x3f/0xaa [ 643.414356] Code: 89 83 88 00 00 00 31 c0 5b 41 5c 41 5d 5d c3 55 89 f1 48 c7 c2 98 bc 35 a0 48 89 fe 48 c7 c7 05 be 35 a0 48 89 e5 e8 13 46 dd e0 <0f> 0b 55 89 f1 48 c7 c2 9f d3 35 a0 48 89 fe 48 c7 c7 7a d5 35 [ 643.414356] RIP [] assfail.constprop.41+0x1c/0x1e [btrfs] [ 643.414356] RSP [ 643.468267] ---[ end trace 6a1b3fb1a9d7d6e3 ]--- This can be easily reproduced by running xfstests with the integrity checker enabled. Fixes: 1ba98d086fe3 (Btrfs: detect corruption when non-root leaf has zero item) Signed-off-by: Filipe Manana Reviewed-by: Liu Bo Signed-off-by: Greg Kroah-Hartman commit c01ea880e88abedec7c4f02fcd9a110ce73c7ccb Author: David Sterba Date: Tue Nov 1 14:21:23 2016 +0100 btrfs: store and load values of stripes_min/stripes_max in balance status item commit ed0df618b1b06d7431ee4d985317fc5419a5d559 upstream. The balance status item contains currently known filter values, but the stripes filter was unintentionally not among them. This would mean, that interrupted and automatically restarted balance does not apply the stripe filters. Fixes: dee32d0ac3719ef8d640efaf0884111df444730f Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 01f285fe1d885dadfd7439205c700340ec69ab13 Author: Filipe Manana Date: Tue Nov 1 11:23:31 2016 +0000 Btrfs: fix relocation incorrectly dropping data references commit 054570a1dc94de20e7a612cddcc5a97db9c37b6f upstream. During relocation of a data block group we create a relocation tree for each fs/subvol tree by making a snapshot of each tree using btrfs_copy_root() and the tree's commit root, and then setting the last snapshot field for the fs/subvol tree's root to the value of the current transaction id minus 1. However this can lead to relocation later dropping references that it did not create if we have qgroups enabled, leaving the filesystem in an inconsistent state that keeps aborting transactions. Lets consider the following example to explain the problem, which requires qgroups to be enabled. We are relocating data block group Y, we have a subvolume with id 258 that has a root at level 1, that subvolume is used to store directory entries for snapshots and we are currently at transaction 3404. When committing transaction 3404, we have a pending snapshot and therefore we call btrfs_run_delayed_items() at transaction.c:create_pending_snapshot() in order to create its dentry at subvolume 258. This results in COWing leaf A from root 258 in order to add the dentry. Note that leaf A also contains file extent items referring to extents from some other block group X (we are currently relocating block group Y). Later on, still at create_pending_snapshot() we call qgroup_account_snapshot(), which switches the commit root for root 258 when it calls switch_commit_roots(), so now the COWed version of leaf A, lets call it leaf A', is accessible from the commit root of tree 258. At the end of qgroup_account_snapshot(), we call record_root_in_trans() with 258 as its argument, which results in btrfs_init_reloc_root() being called, which in turn calls relocation.c:create_reloc_root() in order to create a relocation tree associated to root 258, which results in assigning the value of 3403 (which is the current transaction id minus 1 = 3404 - 1) to the last_snapshot field of root 258. When creating the relocation tree root at ctree.c:btrfs_copy_root() we add a shared reference for leaf A', corresponding to the relocation tree's root, when we call btrfs_inc_ref() against the COWed root (a copy of the commit root from tree 258), which is at level 1. So at this point leaf A' has 2 references, one normal reference corresponding to root 258 and one shared reference corresponding to the root of the relocation tree. Transaction 3404 finishes its commit and transaction 3405 is started by relocation when calling merge_reloc_root() for the relocation tree associated to root 258. In the meanwhile leaf A' is COWed again, in response to some filesystem operation, when we are still at transaction 3405. However when we COW leaf A', at ctree.c:update_ref_for_cow(), we call btrfs_block_can_be_shared() in order to figure out if other trees refer to the leaf and if any such trees exists, add a full back reference to leaf A' - but btrfs_block_can_be_shared() incorrectly returns false because the following condition is false: btrfs_header_generation(buf) <= btrfs_root_last_snapshot(&root->root_item) which evaluates to 3404 <= 3403. So after leaf A' is COWed, it stays with only one reference, corresponding to the shared reference we created when we called btrfs_copy_root() to create the relocation tree's root and btrfs_inc_ref() ends up not being called for leaf A' nor we end up setting the flag BTRFS_BLOCK_FLAG_FULL_BACKREF in leaf A'. This results in not adding shared references for the extents from block group X that leaf A' refers to with its file extent items. Later, after merging the relocation root we do a call to to btrfs_drop_snapshot() in order to delete the relocation tree. This ends up calling do_walk_down() when path->slots[1] points to leaf A', which results in calling btrfs_lookup_extent_info() to get the number of references for leaf A', which is 1 at this time (only the shared reference exists) and this value is stored at wc->refs[0]. After this walk_up_proc() is called when wc->level is 0 and path->nodes[0] corresponds to leaf A'. Because the current level is 0 and wc->refs[0] is 1, it does call btrfs_dec_ref() against leaf A', which results in removing the single references that the extents from block group X have which are associated to root 258 - the expectation was to have each of these extents with 2 references - one reference for root 258 and one shared reference related to the root of the relocation tree, and so we would drop only the shared reference (because leaf A' was supposed to have the flag BTRFS_BLOCK_FLAG_FULL_BACKREF set). This leaves the filesystem in an inconsistent state as we now have file extent items in a subvolume tree that point to extents from block group X without references in the extent tree. So later on when we try to decrement the references for these extents, for example due to a file unlink operation, truncate operation or overwriting ranges of a file, we fail because the expected references do not exist in the extent tree. This leads to warnings and transaction aborts like the following: [ 588.965795] ------------[ cut here ]------------ [ 588.965815] WARNING: CPU: 2 PID: 2479 at fs/btrfs/extent-tree.c:1625 lookup_inline_extent_backref+0x432/0x5b0 [btrfs] [ 588.965816] Modules linked in: af_packet iscsi_ibft iscsi_boot_sysfs xfs libcrc32c ppdev acpi_cpufreq button tpm_tis e1000 i2c_piix4 pcspkr parport_pc parport tpm qemu_fw_cfg joydev btrfs xor raid6_pq sr_mod cdrom ata_generic virtio_scsi ata_piix virtio_pci bochs_drm virtio_ring drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops virtio ttm serio_raw drm floppy sg [ 588.965831] CPU: 2 PID: 2479 Comm: kworker/u8:7 Not tainted 4.7.3-3-default-fdm+ #1 [ 588.965832] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 [ 588.965844] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs] [ 588.965845] 0000000000000000 ffff8802263bfa28 ffffffff813af542 0000000000000000 [ 588.965847] 0000000000000000 ffff8802263bfa68 ffffffff81081e8b 0000065900000000 [ 588.965848] ffff8801db2af000 000000012bbe2000 0000000000000000 ffff880215703b48 [ 588.965849] Call Trace: [ 588.965852] [] dump_stack+0x63/0x81 [ 588.965854] [] __warn+0xcb/0xf0 [ 588.965855] [] warn_slowpath_null+0x1d/0x20 [ 588.965863] [] lookup_inline_extent_backref+0x432/0x5b0 [btrfs] [ 588.965865] [] ? trace_clock_local+0x10/0x30 [ 588.965867] [] ? rb_reserve_next_event+0x6f/0x460 [ 588.965875] [] insert_inline_extent_backref+0x55/0xd0 [btrfs] [ 588.965882] [] __btrfs_inc_extent_ref.isra.55+0x8f/0x240 [btrfs] [ 588.965890] [] __btrfs_run_delayed_refs+0x74a/0x1260 [btrfs] [ 588.965892] [] ? cpuacct_charge+0x86/0xa0 [ 588.965900] [] btrfs_run_delayed_refs+0x9f/0x2c0 [btrfs] [ 588.965908] [] delayed_ref_async_start+0x94/0xb0 [btrfs] [ 588.965918] [] btrfs_scrubparity_helper+0xca/0x350 [btrfs] [ 588.965928] [] btrfs_extent_refs_helper+0xe/0x10 [btrfs] [ 588.965930] [] process_one_work+0x1f3/0x4e0 [ 588.965931] [] worker_thread+0x48/0x4e0 [ 588.965932] [] ? process_one_work+0x4e0/0x4e0 [ 588.965934] [] kthread+0xc9/0xe0 [ 588.965936] [] ret_from_fork+0x1f/0x40 [ 588.965937] [] ? kthread_worker_fn+0x170/0x170 [ 588.965938] ---[ end trace 34e5232c933a1749 ]--- [ 588.966187] ------------[ cut here ]------------ [ 588.966196] WARNING: CPU: 2 PID: 2479 at fs/btrfs/extent-tree.c:2966 btrfs_run_delayed_refs+0x28c/0x2c0 [btrfs] [ 588.966196] BTRFS: Transaction aborted (error -5) [ 588.966197] Modules linked in: af_packet iscsi_ibft iscsi_boot_sysfs xfs libcrc32c ppdev acpi_cpufreq button tpm_tis e1000 i2c_piix4 pcspkr parport_pc parport tpm qemu_fw_cfg joydev btrfs xor raid6_pq sr_mod cdrom ata_generic virtio_scsi ata_piix virtio_pci bochs_drm virtio_ring drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops virtio ttm serio_raw drm floppy sg [ 588.966206] CPU: 2 PID: 2479 Comm: kworker/u8:7 Tainted: G W 4.7.3-3-default-fdm+ #1 [ 588.966207] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 [ 588.966217] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs] [ 588.966217] 0000000000000000 ffff8802263bfc98 ffffffff813af542 ffff8802263bfce8 [ 588.966219] 0000000000000000 ffff8802263bfcd8 ffffffff81081e8b 00000b96345ee000 [ 588.966220] ffffffffa021ae1c ffff880215703b48 00000000000005fe ffff8802345ee000 [ 588.966221] Call Trace: [ 588.966223] [] dump_stack+0x63/0x81 [ 588.966224] [] __warn+0xcb/0xf0 [ 588.966225] [] warn_slowpath_fmt+0x4f/0x60 [ 588.966233] [] btrfs_run_delayed_refs+0x28c/0x2c0 [btrfs] [ 588.966241] [] delayed_ref_async_start+0x94/0xb0 [btrfs] [ 588.966250] [] btrfs_scrubparity_helper+0xca/0x350 [btrfs] [ 588.966259] [] btrfs_extent_refs_helper+0xe/0x10 [btrfs] [ 588.966260] [] process_one_work+0x1f3/0x4e0 [ 588.966261] [] worker_thread+0x48/0x4e0 [ 588.966263] [] ? process_one_work+0x4e0/0x4e0 [ 588.966264] [] kthread+0xc9/0xe0 [ 588.966265] [] ret_from_fork+0x1f/0x40 [ 588.966267] [] ? kthread_worker_fn+0x170/0x170 [ 588.966268] ---[ end trace 34e5232c933a174a ]--- [ 588.966269] BTRFS: error (device sda2) in btrfs_run_delayed_refs:2966: errno=-5 IO failure [ 588.966270] BTRFS info (device sda2): forced readonly This was happening often on openSUSE and SLE systems using btrfs as the root filesystem (with its default layout where multiple subvolumes are used) where balance happens in the background triggered by a cron job and snapshots are automatically created before/after package installations, upgrades and removals. The issue could be triggered simply by running the following loop on the first system boot post installation: while true; do zypper -n in nfs-kernel-server zypper -n rm nfs-kernel-server done (If we were fast enough and made that loop before the cron job triggered a balance operation and the balance finished) So fix by setting the last_snapshot field of the root to the value of the generation of its commit root. Like this btrfs_block_can_be_shared() behaves correctly for the case where the relocation root is created during a transaction commit and for the case where it's created before a transaction commit. Fixes: 6426c7ad697d (btrfs: qgroup: Fix qgroup accounting when creating snapshot) Signed-off-by: Filipe Manana Reviewed-by: Josef Bacik Signed-off-by: Greg Kroah-Hartman commit 26dc52465f0df557d154ab59f8567431d4876671 Author: Robbie Ko Date: Fri Oct 7 17:30:47 2016 +0800 Btrfs: fix tree search logic when replaying directory entry deletes commit 2a7bf53f577e49c43de4ffa7776056de26db65d9 upstream. If a log tree has a layout like the following: leaf N: ... item 240 key (282 DIR_LOG_ITEM 0) itemoff 8189 itemsize 8 dir log end 1275809046 leaf N + 1: item 0 key (282 DIR_LOG_ITEM 3936149215) itemoff 16275 itemsize 8 dir log end 18446744073709551615 ... When we pass the value 1275809046 + 1 as the parameter start_ret to the function tree-log.c:find_dir_range() (done by replay_dir_deletes()), we end up with path->slots[0] having the value 239 (points to the last item of leaf N, item 240). Because the dir log item in that position has an offset value smaller than *start_ret (1275809046 + 1) we need to move on to the next leaf, however the logic for that is wrong since it compares the current slot to the number of items in the leaf, which is smaller and therefore we don't lookup for the next leaf but instead we set the slot to point to an item that does not exist, at slot 240, and we later operate on that slot which has unexpected content or in the worst case can result in an invalid memory access (accessing beyond the last page of leaf N's extent buffer). So fix the logic that checks when we need to lookup at the next leaf by first incrementing the slot and only after to check if that slot is beyond the last item of the current leaf. Signed-off-by: Robbie Ko Reviewed-by: Filipe Manana Fixes: e02119d5a7b4 (Btrfs: Add a write ahead tree log to optimize synchronous operations) Signed-off-by: Filipe Manana [Modified changelog for clarity and correctness] Signed-off-by: Greg Kroah-Hartman commit 664b053c53638f9a7b9f129d65182268fb504bf0 Author: Robbie Ko Date: Fri Oct 28 10:48:26 2016 +0800 Btrfs: fix deadlock caused by fsync when logging directory entries commit ec125cfb7ae2157af3dd45dd8abe823e3e233eec upstream. While logging new directory entries, at tree-log.c:log_new_dir_dentries(), after we call btrfs_search_forward() we get a leaf with a read lock on it, and without unlocking that leaf we can end up calling btrfs_iget() to get an inode pointer. The later (btrfs_iget()) can end up doing a read-only search on the same tree again, if the inode is not in memory already, which ends up causing a deadlock if some other task in the meanwhile started a write search on the tree and is attempting to write lock the same leaf that btrfs_search_forward() locked while holding write locks on upper levels of the tree blocking the read search from btrfs_iget(). In this scenario we get a deadlock. So fix this by releasing the search path before calling btrfs_iget() at tree-log.c:log_new_dir_dentries(). Example trace of such deadlock: [ 4077.478852] kworker/u24:10 D ffff88107fc90640 0 14431 2 0x00000000 [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] [ 4077.494346] ffff880ffa56bad0 0000000000000046 0000000000009000 ffff880ffa56bfd8 [ 4077.502629] ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 ffff88101a5d6138 [ 4077.510915] ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 ffff881016ce21c0 [ 4077.519202] Call Trace: [ 4077.528752] [] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs] [ 4077.536049] [] ? wake_up_atomic_t+0x30/0x30 [ 4077.542574] [] ? btrfs_search_slot+0x79f/0xb10 [btrfs] [ 4077.550171] [] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs] [ 4077.558252] [] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs] [ 4077.566140] [] ? add_delayed_data_ref+0xe2/0x150 [btrfs] [ 4077.573928] [] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs] [ 4077.582399] [] ? __set_extent_bit+0x4c0/0x5c0 [btrfs] [ 4077.589896] [] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs] [ 4077.599632] [] ? start_transaction+0x8d/0x470 [btrfs] [ 4077.607134] [] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs] [ 4077.615329] [] ? process_one_work+0x142/0x3d0 [ 4077.622043] [] ? worker_thread+0x109/0x3b0 [ 4077.628459] [] ? manage_workers.isra.26+0x270/0x270 [ 4077.635759] [] ? kthread+0xaf/0xc0 [ 4077.641404] [] ? kthread_create_on_node+0x110/0x110 [ 4077.648696] [] ? ret_from_fork+0x58/0x90 [ 4077.654926] [] ? kthread_create_on_node+0x110/0x110 [ 4078.358087] kworker/u24:15 D ffff88107fcd0640 0 14436 2 0x00000000 [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] [ 4078.373574] ffff880ffa57fad0 0000000000000046 0000000000009000 ffff880ffa57ffd8 [ 4078.381864] ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 ffff88101a5d6138 [ 4078.390163] ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 ffff88103004d0a0 [ 4078.398466] Call Trace: [ 4078.408019] [] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs] [ 4078.415322] [] ? wake_up_atomic_t+0x30/0x30 [ 4078.421844] [] ? btrfs_search_slot+0x79f/0xb10 [btrfs] [ 4078.429438] [] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs] [ 4078.437518] [] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs] [ 4078.445404] [] ? add_delayed_data_ref+0xe2/0x150 [btrfs] [ 4078.453194] [] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs] [ 4078.461663] [] ? __set_extent_bit+0x4c0/0x5c0 [btrfs] [ 4078.469161] [] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs] [ 4078.478893] [] ? start_transaction+0x8d/0x470 [btrfs] [ 4078.486388] [] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs] [ 4078.494561] [] ? process_one_work+0x142/0x3d0 [ 4078.501278] [] ? pwq_activate_delayed_work+0x27/0x40 [ 4078.508673] [] ? worker_thread+0x109/0x3b0 [ 4078.515098] [] ? manage_workers.isra.26+0x270/0x270 [ 4078.522396] [] ? kthread+0xaf/0xc0 [ 4078.528032] [] ? kthread_create_on_node+0x110/0x110 [ 4078.535325] [] ? ret_from_fork+0x58/0x90 [ 4078.541552] [] ? kthread_create_on_node+0x110/0x110 [ 4079.355824] user-space-program D ffff88107fd30640 0 32020 1 0x00000000 [ 4079.363716] ffff880eae8eba10 0000000000000086 0000000000009000 ffff880eae8ebfd8 [ 4079.372003] ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 ffff88101a5d6138 [ 4079.380294] ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 ffff881016c162c0 [ 4079.388586] Call Trace: [ 4079.398134] [] ? btrfs_tree_lock+0x85/0x2f0 [btrfs] [ 4079.405431] [] ? wake_up_atomic_t+0x30/0x30 [ 4079.411955] [] ? btrfs_lock_root_node+0x2b/0x40 [btrfs] [ 4079.419644] [] ? btrfs_search_slot+0xa03/0xb10 [btrfs] [ 4079.427237] [] ? btrfs_buffer_uptodate+0x52/0x70 [btrfs] [ 4079.435041] [] ? generic_bin_search.constprop.38+0x80/0x190 [btrfs] [ 4079.443897] [] ? btrfs_insert_empty_items+0x74/0xd0 [btrfs] [ 4079.451975] [] ? copy_items+0x128/0x850 [btrfs] [ 4079.458890] [] ? btrfs_log_inode+0x629/0xbf3 [btrfs] [ 4079.466292] [] ? btrfs_log_inode_parent+0xc61/0xf30 [btrfs] [ 4079.474373] [] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs] [ 4079.482161] [] ? btrfs_sync_file+0x20d/0x330 [btrfs] [ 4079.489558] [] ? do_fsync+0x4c/0x80 [ 4079.495300] [] ? SyS_fdatasync+0xa/0x10 [ 4079.501422] [] ? system_call_fastpath+0x16/0x1b [ 4079.508334] user-space-program D ffff88107fc30640 0 32021 1 0x00000004 [ 4079.516226] ffff880eae8efbf8 0000000000000086 0000000000009000 ffff880eae8effd8 [ 4079.524513] ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 ffff88101a5d6138 [ 4079.532802] ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 ffff881030279610 [ 4079.541092] Call Trace: [ 4079.550642] [] ? btrfs_tree_lock+0x85/0x2f0 [btrfs] [ 4079.557941] [] ? wake_up_atomic_t+0x30/0x30 [ 4079.564463] [] ? btrfs_search_slot+0x79f/0xb10 [btrfs] [ 4079.572058] [] ? btrfs_truncate_inode_items+0x168/0xb90 [btrfs] [ 4079.580526] [] ? join_transaction.isra.15+0x1e/0x3a0 [btrfs] [ 4079.588701] [] ? start_transaction+0x8d/0x470 [btrfs] [ 4079.596196] [] ? block_rsv_add_bytes+0x16/0x50 [btrfs] [ 4079.603789] [] ? btrfs_truncate+0xe9/0x2e0 [btrfs] [ 4079.610994] [] ? btrfs_setattr+0x30b/0x410 [btrfs] [ 4079.618197] [] ? notify_change+0x1dc/0x680 [ 4079.624625] [] ? aa_path_perm+0xd4/0x160 [ 4079.630854] [] ? do_truncate+0x5b/0x90 [ 4079.636889] [] ? do_sys_ftruncate.constprop.15+0x10a/0x160 [ 4079.644869] [] ? SyS_fcntl+0x5b/0x570 [ 4079.650805] [] ? system_call_fastpath+0x16/0x1b [ 4080.410607] user-space-program D ffff88107fc70640 0 32028 12639 0x00000004 [ 4080.418489] ffff880eaeccbbe0 0000000000000086 0000000000009000 ffff880eaeccbfd8 [ 4080.426778] ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 ffff88101a5d6138 [ 4080.435067] ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 ffff880f317ef1e0 [ 4080.443353] Call Trace: [ 4080.452920] [] ? btrfs_tree_read_lock+0xdd/0x190 [btrfs] [ 4080.460703] [] ? wake_up_atomic_t+0x30/0x30 [ 4080.467225] [] ? btrfs_read_lock_root_node+0x2b/0x40 [btrfs] [ 4080.475400] [] ? btrfs_search_slot+0x801/0xb10 [btrfs] [ 4080.482994] [] ? btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs] [ 4080.491857] [] ? btrfs_lookup_inode+0x26/0x90 [btrfs] [ 4080.499353] [] ? kmem_cache_alloc+0xaf/0xc0 [ 4080.505879] [] ? btrfs_iget+0xd5/0x5d0 [btrfs] [ 4080.512696] [] ? btrfs_get_token_64+0x104/0x120 [btrfs] [ 4080.520387] [] ? btrfs_log_inode_parent+0xbdf/0xf30 [btrfs] [ 4080.528469] [] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs] [ 4080.536258] [] ? btrfs_sync_file+0x20d/0x330 [btrfs] [ 4080.543657] [] ? do_fsync+0x4c/0x80 [ 4080.549399] [] ? SyS_fdatasync+0xa/0x10 [ 4080.555534] [] ? system_call_fastpath+0x16/0x1b Signed-off-by: Robbie Ko Reviewed-by: Filipe Manana Fixes: 2f2ff0ee5e43 (Btrfs: fix metadata inconsistencies after directory fsync) Signed-off-by: Filipe Manana [Modified changelog for clarity and correctness] Signed-off-by: Greg Kroah-Hartman commit 7d470f04e36c6a23838e9bb384535b7c61fe372c Author: Liu Bo Date: Fri Sep 2 12:35:34 2016 -0700 Btrfs: fix BUG_ON in btrfs_mark_buffer_dirty commit ef85b25e982b5bba1530b936e283ef129f02ab9d upstream. This can only happen with CONFIG_BTRFS_FS_CHECK_INTEGRITY=y. Commit 1ba98d0 ("Btrfs: detect corruption when non-root leaf has zero item") assumes that a leaf is its root when leaf->bytenr == btrfs_root_bytenr(root), however, we should not use btrfs_root_bytenr(root) since it's mainly got updated during committing transaction. So the check can fail when doing COW on this leaf while it is a root. This changes to use "if (leaf == btrfs_root_node(root))" instead, just like how we check whether leaf is a root in __btrfs_cow_block(). Fixes: 1ba98d086fe3 (Btrfs: detect corruption when non-root leaf has zero item) Reported-by: Jeff Mahoney Signed-off-by: Liu Bo Reviewed-by: Filipe Manana Signed-off-by: Greg Kroah-Hartman commit 3bac322e18c3c5f3fc1d77ab0db1fca2e3b7cb03 Author: Maxim Patlasov Date: Mon Dec 12 14:32:44 2016 -0800 btrfs: limit async_work allocation and worker func duration commit 2939e1a86f758b55cdba73e29397dd3d94df13bc upstream. Problem statement: unprivileged user who has read-write access to more than one btrfs subvolume may easily consume all kernel memory (eventually triggering oom-killer). Reproducer (./mkrmdir below essentially loops over mkdir/rmdir): [root@kteam1 ~]# cat prep.sh DEV=/dev/sdb mkfs.btrfs -f $DEV mount $DEV /mnt for i in `seq 1 16` do mkdir /mnt/$i btrfs subvolume create /mnt/SV_$i ID=`btrfs subvolume list /mnt |grep "SV_$i$" |cut -d ' ' -f 2` mount -t btrfs -o subvolid=$ID $DEV /mnt/$i chmod a+rwx /mnt/$i done [root@kteam1 ~]# sh prep.sh [maxim@kteam1 ~]$ for i in `seq 1 16`; do ./mkrmdir /mnt/$i 2000 2000 & done [root@kteam1 ~]# for i in `seq 1 4`; do grep "kmalloc-128" /proc/slabinfo | grep -v dma; sleep 60; done kmalloc-128 10144 10144 128 32 1 : tunables 0 0 0 : slabdata 317 317 0 kmalloc-128 9992352 9992352 128 32 1 : tunables 0 0 0 : slabdata 312261 312261 0 kmalloc-128 24226752 24226752 128 32 1 : tunables 0 0 0 : slabdata 757086 757086 0 kmalloc-128 42754240 42754240 128 32 1 : tunables 0 0 0 : slabdata 1336070 1336070 0 The huge numbers above come from insane number of async_work-s allocated and queued by btrfs_wq_run_delayed_node. The problem is caused by btrfs_wq_run_delayed_node() queuing more and more works if the number of delayed items is above BTRFS_DELAYED_BACKGROUND. The worker func (btrfs_async_run_delayed_root) processes at least BTRFS_DELAYED_BATCH items (if they are present in the list). So, the machinery works as expected while the list is almost empty. As soon as it is getting bigger, worker func starts to process more than one item at a time, it takes longer, and the chances to have async_works queued more than needed is getting higher. The problem above is worsened by another flaw of delayed-inode implementation: if async_work was queued in a throttling branch (number of items >= BTRFS_DELAYED_WRITEBACK), corresponding worker func won't quit until the number of items < BTRFS_DELAYED_BACKGROUND / 2. So, it is possible that the func occupies CPU infinitely (up to 30sec in my experiments): while the func is trying to drain the list, the user activity may add more and more items to the list. The patch fixes both problems in straightforward way: refuse queuing too many works in btrfs_wq_run_delayed_node and bail out of worker func if at least BTRFS_DELAYED_WRITEBACK items are processed. Changed in v2: remove support of thresh == NO_THRESHOLD. Signed-off-by: Maxim Patlasov Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 56eaecc8ecf398c7af3fa252eff6164102b0ace1 Author: Michal Hocko Date: Wed Dec 7 14:54:38 2016 +0100 hotplug: Make register and unregister notifier API symmetric commit 777c6e0daebb3fcefbbd6f620410a946b07ef6d0 upstream. Yu Zhao has noticed that __unregister_cpu_notifier only unregisters its notifiers when HOTPLUG_CPU=y while the registration might succeed even when HOTPLUG_CPU=n if MODULE is enabled. This means that e.g. zswap might keep a stale notifier on the list on the manual clean up during the pool tear down and thus corrupt the list. Resulting in the following [ 144.964346] BUG: unable to handle kernel paging request at ffff880658a2be78 [ 144.971337] IP: [] raw_notifier_chain_register+0x1b/0x40 [ 145.122628] Call Trace: [ 145.125086] [] __register_cpu_notifier+0x18/0x20 [ 145.131350] [] zswap_pool_create+0x273/0x400 [ 145.137268] [] __zswap_param_set+0x1fc/0x300 [ 145.143188] [] ? trace_hardirqs_on+0xd/0x10 [ 145.149018] [] ? kernel_param_lock+0x28/0x30 [ 145.154940] [] ? __might_fault+0x4f/0xa0 [ 145.160511] [] zswap_compressor_param_set+0x17/0x20 [ 145.167035] [] param_attr_store+0x5c/0xb0 [ 145.172694] [] module_attr_store+0x1d/0x30 [ 145.178443] [] sysfs_kf_write+0x4f/0x70 [ 145.183925] [] kernfs_fop_write+0x149/0x180 [ 145.189761] [] __vfs_write+0x18/0x40 [ 145.194982] [] vfs_write+0xb2/0x1a0 [ 145.200122] [] SyS_write+0x52/0xa0 [ 145.205177] [] entry_SYSCALL_64_fastpath+0x12/0x17 This can be even triggered manually by changing /sys/module/zswap/parameters/compressor multiple times. Fix this issue by making unregister APIs symmetric to the register so there are no surprises. Fixes: 47e627bc8c9a ("[PATCH] hotplug: Allow modules to use the cpu hotplug notifiers even if !CONFIG_HOTPLUG_CPU") Reported-and-tested-by: Yu Zhao Signed-off-by: Michal Hocko Cc: linux-mm@kvack.org Cc: Andrew Morton Cc: Dan Streetman Link: http://lkml.kernel.org/r/20161207135438.4310-1-mhocko@kernel.org Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman