Multiclass Classification
Learn how to apply multiclass classification in ML.NET.
A multiclass classification problem is one in which we need to put an item into exactly one of several categories. As the following diagram demonstrates, a multiclass classification model accepts an input object and, based on the values of its features, assigns it to one of those categories:
In this lesson, we'll look at how to train multiclass classification models with ML.NET both via a CLI command and through code.
Building a multiclass classification model
The playground below has two tab-separated files containing software error reports. These files represent a subset of errors from an open-source Microsoft library of ML datasets. Each record in a file contains the following fields:
ID: Represents a unique identifier of a record.
Area: Represents the problem area. This is what we'll use as the labels for our categories.
Title: Represents the error title.
Description: Represents a detailed description of an error.
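The four fields listed above map naturally onto an ML.NET input class. The following is a minimal sketch, assuming the Microsoft.ML NuGet package is referenced and that the columns appear in the order listed. The class name `ErrorRecord` and the path `errors_train.tsv` are hypothetical placeholders, not names taken from the playground.

```csharp
using System;
using System.IO;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical path: substitute one of the playground's tab-separated files.
const string dataPath = "errors_train.tsv";

if (!File.Exists(dataPath))
{
    Console.WriteLine($"Data file not found: {dataPath}");
    return;
}

// Create the ML.NET context; a fixed seed keeps runs repeatable.
var mlContext = new MLContext(seed: 0);

// LoadFromTextFile uses a tab separator by default; hasHeader: true skips
// the "ID Area Title Description" header row.
IDataView data = mlContext.Data.LoadFromTextFile<ErrorRecord>(dataPath, hasHeader: true);

// Print the column names ML.NET derived from the class below.
foreach (var column in data.Schema)
{
    Console.WriteLine(column.Name);
}

// Input schema (assumed layout): one property per field, indexed by column position.
public class ErrorRecord
{
    [LoadColumn(0)]
    public string ID { get; set; } = "";

    [LoadColumn(1)]
    public string Area { get; set; } = "";   // the category label

    [LoadColumn(2)]
    public string Title { get; set; } = "";

    [LoadColumn(3)]
    public string Description { get; set; } = "";
}
```

Because `Area` holds the category labels, it is the column a multiclass trainer would later be asked to predict, while `Title` and `Description` would most likely supply the text features.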
ID Area Title Description 17 area-System.Xml Some XPath.XDocument tests are failing Some XPath.XDocument queries have different results than other XPath navigators. This might be an old behavior or newly introduced bug. Failing tests: build /p=IncludeTraits=ActiveIssue=17 20 area-System.Xml 2 XPath.XDocument tests fail because of lacking feature "XPath.XDocument navigator doesn't support MoveToId(string). Verify if this was ever supported. If it was, verify if we want to support it in the future. If it wasn't move the tests to a different file and remove them from XPath.XDocument.Tests project. Failing tests: NodeSetFunctionsTest2267 MatchesTest2352 <!--- @huboard:{""order"":20.0,""milestone_order"":20,""custom_state"":""""} --> " 22 area-System.Numerics Two Numerics Tests are failing only on our CI server Two of the tests in our System.Numerics.Vectors suite are failing only on our CI build server, and potentially only intermittently: Vector2NormalizeTest1 Vector4NormalizeTest2 Given that these are very similar to other tests which cover a similar edge-case (especially the Vector3 normalization tests, which aren't failing), we will need to investigate why these tests in particular are failing on our build server. This may have been a point-in-time issue as we brought up our build infrastructure, and may not re-surface again. 36 area-System.Numerics SIMD test failures on non-ENU configurations. After pulling both of @adamralph 's pull requests #31 and #32, I'm continuing to see test failures for SIMD on a DEU (German) test environment. 
Here's a representative error: d:\oss\corefx\src\System.Numerics.Vectors\tests\GenericVectorTests.cs(545): error : System.Numerics.Tests.GenericVe ctorTests.ToStringCurrencySByte: Assert.Equal() Failure\r\nPosition: First difference is at position 8\r\nExpected: <97,00 ?, -108,00 ?, 22,00 ?, 29,00 ?, 49,00 ?, 60,00 ?, 103,00 ?, 58,00 ?, -62,00 ?, -124,00 ?, -117,00 ?, 48,00 ?, 15,00 ?, -35,00 ?, -13,00 ?, -34,00 ?>\r\nActual: <97,00 ?. -108,00 ?. 22,00 ?. 29,00 ?. 49,00 ?. 60,00 ?. 103 ,00 ?. 58,00 ?. -62,00 ?. -124,00 ?. -117,00 ?. 48,00 ?. 15,00 ?. -35,00 ?. -13,00 ?. -34,00 ?> [D:\oss\corefx\bin\ tools\fxbuild.proj] Observe that expected separates elements with a comma, actual separates elements with a dot. 41 area-System.Numerics Quaternion operator overloads should be using the respective methods Quaternion declares a handful of methods to perform addition, subtraction and multiplication, and provides the respective overloads for these operations. However, instead of re-using the `Add`, `Multiply` etc. methods, the code is re-written in the operator overloads. The operators should be using their respective methods rather than re-declaring the same code. This is under the assumption that the JIT inlines the methods when they are used in the operator overloads. 49 area-Infrastructure Add Linux/Mac build script A `build.sh` should be added alongside `build.cmd` to build corefx on Linux/Mac. 50 area-System.Numerics Made Quarternion's operator overloads use their respective methods "Operator overloads of Quaternion now utilize their respective methods, and removed redundant ""this"" qualifiers in Quaternion constructor. " 52 area-Meta ReferenceSource repo license incorrect for individual files some of the files I looked up, it has header comment with Apache 2.0 license. which license should apply for those files? MIT or Apache 2.0 54 area-System.Xml "Remove always true ""if"" and unreachable code in System.Xml.Linq.XObject.SkipNotify method." 
"This: if (o.Annotations<XObjectChangeAnnotation>() != null) is always true because the above while can finish only in two conditions: when o != null or when o.annotations == null. The first condition will be catched by "" if (o == null)"" and if the second one is true, ""o.Annotations<XObjectChangeAnnotation>() != null"" will also always be true i think. " 55 area-System.Xml [Issue 54] Removed always-true if and unreachable code in XObject.cs Removed always-true if and unreachable code in XObject.cs 58 area-System.Xml System.Xml.sln fails to build on Mono, error CS0433 **I know that cross-platform support is coming later, I just thought it might make sense to document this here in case someone else tries the same** Building System.Xml.sln with xbuild on Mono doesn't work (the other solutions build fine), it throws the following errors: ``` Build FAILED. Errors: /home/alexander/dev/corefx/src/System.Xml.sln (default targets) -> (Build target) -> /home/alexander/dev/corefx/src/System.Xml.XPath.XDocument/System.Xml.XPath.XDocument.csproj (default targets) -> /usr/lib/mono/4.5/Microsoft.CSharp.targets (CoreCompile target) -> System/Xml/XPath/XAttributeExtensions.cs(10,56): error CS0433: The imported type `System.Xml.Linq.XAttribute' is defined multiple times System/Xml/XPath/XAttributeExtensions.cs(10,78): error CS0433: The imported type `System.Xml.Linq.XNamespace' is defined multiple times System/Xml/XPath/XDocumentExtensions.cs(22,61): error CS0433: The imported type `System.Xml.Linq.XNode' is defined multiple times System/Xml/XPath/XObjectExtensions.cs(10,23): error CS0433: The imported type `System.Xml.Linq.XContainer' is defined multiple times System/Xml/XPath/XObjectExtensions.cs(10,49): error CS0433: The imported type `System.Xml.Linq.XObject' is defined multiple times System/Xml/XPath/XNodeNavigator.cs(48,9): error CS0433: The imported type `System.Xml.Linq.XElement' is defined multiple times System/Xml/XPath/XNodeNavigator.cs(784,35): error CS0433: The 
imported type `System.Xml.Linq.XText' is defined multiple times ``` It looks like it runs into a conflict with the System.Xml.Linq library in the Mono GAC. Needs further investigation. 69 area-Infrastructure build.cmd does not build solution on HP laptop (when Platform=MCD is pre-set) "I've got HP laptop with windows 7. When I run build.cmd, I've got the error ""configuration is invalid"" ``` C:\Projects\dotnet\corefx>build.cmd C:\Projects\dotnet\corefx\src\System.Collections.Immutable.sln.metaproj : error MSB4126: указанная конфигурация решения ""Release|MCD"" недопустима. Укажите допустимую конфигурац ию решения с помощью свойств Configuration и Platform (например, MSBuild.exe Solution.sln /p:Configuration=Debug /p:Platform=""Any CPU"") или оставьте эти свойства пустыми, чтобы использовать конфигурацию решения по умолчанию. [C:\Projects\dotnet\corefx\src\System.Collections.Immutable.sln] ``` The reason of this error is that environment variable `Platform` is set to `MCD` on the laptop by default. I have to change build command in the build.cmd by adding `/p:Platform=""Any CPU""` to compile the project. Build script should check for allowed platforms and generate user-friendly error message, which says what to do to successfully compile the project when the platform is not supported. " 70 area-Meta Hello, World! sample "Feature request: please add ""Hello, World!"" sample, which shows how to use .NET Core with user applications. " 71 area-System.Numerics Behaviour of `Quaternion.CreateFromAxisAngle` when axis is not a unit vector "The resulting quaternion depends on the length of `axis`. This does not describe a rotation and thus does not match the documentation. There are a few ways to handle this issue: 1. Normalize `axis`, but this incurs a performance hit 2. Add a precondition that `axis.Length()` ≈ 1. But enforcing that pre-condition is about as expensive as 1), so it'd probably need to be a documentation-only precondition. 
Unfortunately this means that consumers will rely on the current behaviour even if it's undocumented, so I don't think that this is a good solution. 3. Document the current behaviour --- Just for convenience, the relevant source code: ``` /// <summary> /// Creates a Quaternion from a vector and an angle to rotate about the vector. /// </summary> /// <param name=""axis"">The vector to rotate around.</param> /// <param name=""angle"">The angle, in radians, to rotate around the vector.</param> /// <returns>The created Quaternion.</returns> public static Quaternion CreateFromAxisAngle(Vector3 axis, float angle) { Quaternion ans; float halfAngle = angle * 0.5f; float s = (float)Math.Sin(halfAngle); float c = (float)Math.Cos(halfAngle); ans.X = axis.X * s; ans.Y = axis.Y * s; ans.Z = axis.Z * s; ans.W = c; return ans; } ``` " 72 area-System.Numerics `Equals` with NaN values (IEEE vs. reflexivity) The built in floating point types compare `NaN` as unequal when using `==` and `!=` (following IEEE semantics) but compare it as equal when using the `Equals` method. Your floating point based types currently use IEEE semantics even for `Equals`. I suggest using the same behaviour as the built in types in your floating point based types like vectors or quaternions. The MSDN documentation of `Equals` contains an exception that allows `A.Equals(A)` to return false on floating point types, so you don't strictly violate its contract. But returning false still breaks hash tables and does not match the behaviour of the built in types, so I consider it a bad idea. This can be avoided by calling `Equals` on the members instead of `==` in the implementation of `Equals` but not in `==` and `!=`. 
For example with quaternion, replace ``` public bool Equals(Quaternion other) { return (X == other.X && Y == other.Y && Z == other.Z && W == other.W); } ``` with ``` public bool Equals(Quaternion other) { return (X.Equals(other.X) && Y.Equals(other.Y) && Z.Equals(other.Z) && W.Equals(other.W)); } ``` You might want to add tests that check that `==` and `!=` compare all the above cases as unequal, so that they match the IEEE specification. Replace: ``` // Counterintuitive result - IEEE rules for NaN comparison are weird! Assert.False(a.Equals(a)); Assert.False(b.Equals(b)); Assert.False(c.Equals(c)); Assert.False(d.Equals(d)); ``` with: ``` // Equals does not follow IEEE semantics since many consumers rely on equality being reflexive. // This includes collections like `Dictionary<K,V>` or `HashSet<T>` Assert.True(a.Equals(a)); Assert.True(b.Equals(b)); Assert.True(c.Equals(c)); Assert.True(d.Equals(d)); // Counterintuitive result - IEEE rules for NaN comparison are weird! Assert.False(a == a); Assert.False(b == b); Assert.False(c == c); Assert.False(d == d); Assert.True(a != a); Assert.True(b != b); Assert.True(c != c); Assert.True(d != d); ``` 77 area-System.Xml Some Xml encoding tests lost their encoding Some of Xml tests were testing problematic characters in different Encoding (like Russian characters). At some point few years back the encoding was lost while moving between different repos. Tests pass because .cs files lost their encoding too. 
Tests need to be rewritten since there is no trace of original copy 81 area-System.Xml Add test coverage for XPath and XDocument - Adding test coverage for XPath implementations (XPath.XPathDocument, XPath.XmlDocument, XPath.XDocument) - Adding some tests for XDocument (more coming later) 94 area-Meta Necessary bits for Mono.Posix I tried to build [Mono.Posix](https://github.com/mono/mono/tree/master/mcs/class/Mono.Posix) against aspnetcore50 and found a lot of API that mono uses to be missing from .NET Core most prominently in the Interop-domain (like CustomMarshaler). Please add Mono.Posix to your list of scenarios that can benefit from the API surface of .NET Core. 110 area-System.Xml Add async document/element loading for XLinq. Adds XElement.LoadAsync and XDocument.LoadAsync. Code from the sync versions has been largely lifted out so they can share an implementation as much as possible. 116 area-System.Numerics Quaternion and public fields Is there a reason that the Quaternion struct has public exposed fields versus read only properties? Seems to be against the general guidelines for structs and immutability. Why would you want to allow X, Y, Z and W to be set outside of the ctor? 118 area-System.Numerics Matrix4x4 - more useful public properties Forward, Backward, Up, Down, Left, Right public properties for Matrix4x4 119 area-System.Numerics Vector3 - more public static properties Up, Down, Left, Right, Forward, Backward vector public static properties 121 area-System.Numerics Matrix4x4 - more useful public properties Added Forward, Backward, Up, Down, Left, Right vectors public properties for Matrix4x4. Fix #118 129 area-System.Xml Add XmlReader System.Xml.XmlReader is currently missing. 
136 area-System.Xml XPath tests with expressions resulting in +/- infinity fail in Windows 10 "The following XPath tests fail for me on Windows 10 Technical Preview, build 9879: StringFunctionsTest2457 StringFunctionsTest2458 The call to Convert.ChangeType in XPathTests.Common.Utils.XPathObject<T> is throwing a FormatException when the value passed is ""Infinity"" or ""-Infinity"". The tests pass for me on Windows 8.1. Each of these tests is part of 3 different test assemblies, resulting in 6 failures total: System.Xml.XPath.Tests.dll System.Xml.XPath.XDocument.Tests.dll System.Xml.XPath.XmlDocument.Tests.dll " 141 area-System.Numerics Added .ToArray() and .FromArray() for Vector types This commit adds Vector2.ToArray(), Vector2.FromArray(float[]), Vector3.ToArray(), Vector3.FromArray(float[]), Vector4.ToArray() and Vector4.FromArray(float[]). The FromArray method copies the array elements to a vector: the element at index 0 becomes X, the element at index 1 becomes Y and so on. The ToArray() method converts the vector to an array: it puts X at index 0, Y at index 1 and so on. These methods looked useful to me if you have an array but want to do vector operations. 142 area-System.Numerics Vector2 and Vector3 Projection The current source includes public static methods for reflection of a vector about a unit vector normal but the underlying vector projection is also useful by itself. Proposing: ``` C# public static Vector2 Project(Vector2 vector, Vector2 component) { ... } //... public static Vector3 Project(Vector3 vector, Vector3 component) { ... } ``` I'll wait for feed back before trying to submit a pull request. 144 area-Infrastructure All the linked files are in the root folder of the XML projects. When you reference files outside the project directory, such as the common directory in the XML code case, these links by default show up in the root of the project in VS which is very annoying and clutters the solution unnecessarily. 
We should add the appropriate <Link> metadata to these Compile items so that VS will group them correctly in the solution explorer. 149 area-System.Xml Wrong named tests in System.Xml.XmlDocument "- The files NodeChangingTests.cs, NodeChangedTests.cs, NodeInsertingTests.cs and NodeInsertedTests.cs contain test cases called ""RemoveEventHandler"". No test uses a ""Remove""-event. Maybe a Copy&Paste mistake from file NodeRemovedTests.cs. - The files NodeChangingTests.cs and NodeChangedTests.cs contain test cases called ""RemoveNode"" but testing a change of a node. " 156 area-System.Numerics Private static read only fields for Vectors Private static read only fields for One, Zero and Unit vectors on Vector2, Vector3 and Vector4. 157 area-System.Numerics Alternative versions of methods for Vectors and Matrices "Would be helpful to have alternative versions of methods with ""out"" parameter as result. Example: this ```C# public static Matrix CreateFromQuaternion(Quaternion quaternion) { ... } ``` and that ```C# public static void CreateFromQuaternion(ref Quaternion quaternion, out Matrix result) { ... } ``` What do you think guys? Do you want to add that? " 176 area-System.Linq System.Linq.Parallel Tests are Failing on AppVeyor Machines The following 5 tests are failing (some intermittently) on our AppVeyor CI builds. I haven't investigated deeply, but I suspect it has to do with the limited hardware that the VM's are equipped with. Some of the test explicitly test whether multiple threads are created, which might not even be the case in an underpowered VM. 181 area-System.Linq Removes unnecessary object allocations because of delegates "I introduced several singletons to store delegate instances to avoid unnecessary object allocation in generic methods. 
#### For example `ParallelEnumerable.PerformAggregation()` method **Before:** ``` csharp private static T PerformAggregation<T>(this ParallelQuery<T> source, Func<T, T, T> reduce, T seed, bool seedIsSpecified, bool throwIfEmpty, QueryAggregationOptions options) { Contract.Assert(source != null); Contract.Assert(reduce != null); Contract.Assert(options.IsValidQueryAggregationOption(), ""enum is out of range""); AssociativeAggregationOperator<T, T, T> op = new AssociativeAggregationOperator<T, T, T>( source, seed, null, seedIsSpecified, reduce, reduce, delegate (T obj) { return obj; }, throwIfEmpty, options); return op.Aggregate(); } ``` ``` IL_0000: nop IL_0001: ldarg.0 IL_0002: ldarg.2 IL_0003: ldnull IL_0004: ldarg.3 IL_0005: ldarg.1 IL_0006: ldarg.1 IL_0007: ldnull IL_0008: ldftn !!0 System.Linq.ParallelEnumerable::'<PerformAggregation>b__e'<!!T>(!!0) >>> IL_000e: newobj instance void class [System.Runtime]System.Func`2<!!T, !!T>::.ctor(object, native int) IL_0013: ldarg.s throwIfEmpty IL_0015: ldarg.s options IL_0017: newobj instance void class System.Linq.Parallel.AssociativeAggregationOperator`3<!!T, !!T, !!T>::.ctor(class [System.Runtime]System.Collections.Generic.IEnumerable`1<!0>, !1, class [System.Runtime]System.Func`1<!1>, bool, class [System.Runtime]System.Func`3<!1, !0, !1>, class [System.Runtime]System.Func`3<!1, !1, !1>, class [System.Runtime]System.Func`2<!1, !2>, bool, valuetype System.Linq.Parallel.QueryAggregationOptions) IL_001c: stloc.0 IL_001d: ldloc.0 IL_001e: callvirt instance !2 class System.Linq.Parallel.AssociativeAggregationOperator`3<!!T, !!T, !!T>::Aggregate() IL_0023: stloc.1 IL_0024: br.s IL_0026 IL_0026: ldloc.1 IL_0027: ret ``` **After:** ``` csharp private static T PerformAggregation<T>(this ParallelQuery<T> source, Func<T, T, T> reduce, T seed, bool seedIsSpecified, bool throwIfEmpty, QueryAggregationOptions options) { Contract.Assert(source != null); Contract.Assert(reduce != null); 
Contract.Assert(options.IsValidQueryAggregationOption(), ""enum is out of range""); AssociativeAggregationOperator<T, T, T> op = new AssociativeAggregationOperator<T, T, T>( source, seed, null, seedIsSpecified, reduce, reduce, IdentityFunction<T>.Instance, throwIfEmpty, options); return op.Aggregate(); } ``` ``` IL_0000: nop IL_0001: ldarg.0 IL_0002: ldarg.2 IL_0003: ldnull IL_0004: ldarg.3 IL_0005: ldarg.1 IL_0006: ldarg.1 >>> IL_0007: ldsfld class [System.Runtime]System.Func`2<!0, !0> class System.Linq.IdentityFunction`1<!!T>::Instance IL_000c: ldarg.s throwIfEmpty IL_000e: ldarg.s options IL_0010: newobj instance void class System.Linq.Parallel.AssociativeAggregationOperator`3<!!T, !!T, !!T>::.ctor(class [System.Runtime]System.Collections.Generic.IEnumerable`1<!0>, !1, class [System.Runtime]System.Func`1<!1>, bool, class [System.Runtime]System.Func`3<!1, !0, !1>, class [System.Runtime]System.Func`3<!1, !1, !1>, class [System.Runtime]System.Func`2<!1, !2>, bool, valuetype System.Linq.Parallel.QueryAggregationOptions) IL_0015: stloc.0 IL_0016: ldloc.0 IL_0017: callvirt instance !2 class System.Linq.Parallel.AssociativeAggregationOperator`3<!!T, !!T, !!T>::Aggregate() IL_001c: stloc.1 IL_001d: br.s IL_001f IL_001f: ldloc.1 IL_0020: ret ``` Where `IdentityFunction` looks like ``` csharp internal class IdentityFunction<T> { public static readonly Func<T, T> Instance = Function; private static T Function(T arg) { return arg; } } ``` " 187 area-System.IO Can the System.IO namespace be modified to manipulate device handles? "This assumes the `System.IO` and 'Microsoft.Win32.SafeHandles' namespace are being open sourced in CoreFX. If not close this ticket. Right now if you do the following: ``` var handle = File.Open(""\\\\.\\Global\\ProcmonDebugLogger"", FileMode.Append); ``` You get an exception **FileStream will not open Win32 devices such as disk partitions and tape drives. Avoid use of ""\.\"" in the path.** If you want raw device access, you need to use P/Invoke. 
I propose the following: ``` public sealed class Microsoft.Win32.SafeHandles.SafeDeviceHandle: SafeHandleZeroOrMinusOneIsInvalid { . . . } public static Microsoft.Win32.SafeHandles.SafeFileHandle System.IO.File.OpenHandle(string path) { public bool DeviceIoControl( . . .) { . . . } } public static Microsoft.Win32.SafeHandles.SafeDeviceHandle System.IO.File.OpenDeviceHandle(string path) { . . .} ``` My use case for this would be to remove the [PInvoke functons from ProceMonDebugOutput](https://github.com/Wintellect/ProcMonDebugOutput/blob/master/Source/Sysinternals.Debug/NativeMethods.cs) by @JohnWintellect. " 195 area-System.Xml [System.Xml.XPath] Few tests fail sometimes fail because of NRE Few tests are failing in the same place because of NRE. Stack trace (RegressionTestsTest557): E:\oss\corefx\bin\tools\tests.targets(17,5): error : XPathTests.FunctionalTests.MiscellaneousCases.RegressionTestsTests.RegressionTestsTest557: System.NullReferenceException : Object reference not set to an instance of an object. 
[E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XmlDocumentEx.GetNamespaceXml(XmlDocument xmlDocument)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.DocumentXPathNavigator.MoveToFirstNamespace(XPathNamespaceScope scope)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.NamespaceQuery.Advance()\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.Query.MoveNext()\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.Query.get_Count()\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.XPathSelectionIterator.get_Count()\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.Common.Utils.XPathNodesetTest(String xml, String testExpression, XPathResult expected, XmlNamespaceManager namespaceManager, String startingNodePath)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.FunctionalTests.MiscellaneousCases.RegressionTestsTests.RegressionTestsTest557() [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : XPathTests.FunctionalTests.CoreFunctionLibrary.StringFunctionsTests.StringFunctionsTest2465: System.NullReferenceException : Object reference not set to an instance of an object. 
[E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XmlDocumentEx.GetNamespaceXml(XmlDocument xmlDocument)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.DocumentXPathNavigator.MoveToNextNamespace(XPathNamespaceScope scope)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.NamespaceQuery.Advance()\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.ForwardPositionQuery.Evaluate(XPathNodeIterator context)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.BaseAxisQuery.Evaluate(XPathNodeIterator nodeIterator)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.MergeFilterQuery.Evaluate(XPathNodeIterator nodeIterator)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XPath.XPathNavigator.Evaluate(XPathExpression expr, XPathNodeIterator context)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XPath.XPathNavigator.Evaluate(XPathExpression expr)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XPath.XPathNavigator.Select(XPathExpression expr)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.Common.Utils.CreateNavigator(String xml, String startingNodePath, XmlNamespaceManager namespaceManager)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error 
: at XPathTests.Common.Utils.XPathObject[T](String xml, String testExpression, XmlNamespaceManager namespaceManager, String startingNodePath)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.Common.Utils.XPathStringTest(String xml, String testExpression, Object expected, XmlNamespaceManager namespaceManager, String startingNodePath)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.FunctionalTests.CoreFunctionLibrary.StringFunctionsTests.StringFunctionsTest2465() [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : XPathTests.FunctionalTests.Location.Paths.Axes.ComplexExpressionsTests.ComplexExpressionsTest348: System.NullReferenceException : Object reference not set to an instance of an object. [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XmlDocumentEx.GetNamespaceXml(XmlDocument xmlDocument)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.DocumentXPathNavigator.MoveToNextNamespace(XPathNamespaceScope scope)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.NamespaceQuery.Advance()\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.Query.MoveNext()\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.Query.get_Count()\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.XPathSelectionIterator.get_Count()\r [E:\oss\corefx\bin\tools\fxbuild.proj] 
E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.Common.Utils.XPathNodesetTest(String xml, String testExpression, XPathResult expected, XmlNamespaceManager namespaceManager, String startingNodePath)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.FunctionalTests.Location.Paths.Axes.ComplexExpressionsTests.ComplexExpressionsTest348() [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : XPathTests.FunctionalTests.Location.Paths.Predicates.UsingPositionFunctionTests.UsingPositionFunctionTest681: System.NullReferenceException : Object reference not set to an instance of an object. [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XmlDocumentEx.GetNamespaceXml(XmlDocument xmlDocument)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.DocumentXPathNavigator.MoveToNextNamespace(XPathNamespaceScope scope)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.NamespaceQuery.Advance()\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.MergeFilterQuery.Evaluate(XPathNodeIterator nodeIterator)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XPath.XPathNavigator.Evaluate(XPathExpression expr, XPathNodeIterator context)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XPath.XPathNavigator.Evaluate(XPathExpression expr)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at 
System.Xml.XPath.XPathNavigator.Select(XPathExpression expr)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.Common.Utils.XPathNodesetTest(String xml, String testExpression, XPathResult expected, XmlNamespaceManager namespaceManager, String startingNodePath)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.FunctionalTests.Location.Paths.Predicates.UsingPositionFunctionTests.UsingPositionFunctionTest681() [E:\oss\corefx\bin\tools\fxbuild.proj] Stack trace (NodeSetFunctionsTest22100): E:\oss\corefx\bin\tools\tests.targets(17,5): error : XPathTests.FunctionalTests.CoreFunctionLibrary.NodeSetFunctionsTests.NodeSetFunctionsTest22100: System.NullReferenceException : Object reference not set to an instance of an object. [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XmlDocumentEx.GetNamespaceXml(XmlDocument xmlDocument)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.DocumentXPathNavigator.MoveToNextNamespace(XPathNamespaceScope scope)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.NamespaceQuery.Advance()\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.FilterQuery.Advance()\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at MS.Internal.Xml.XPath.MergeFilterQuery.Evaluate(XPathNodeIterator nodeIterator)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XPath.XPathNavigator.Evaluate(XPathExpression expr, 
XPathNodeIterator context)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XPath.XPathNavigator.Evaluate(XPathExpression expr)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at System.Xml.XPath.XPathNavigator.Select(XPathExpression expr)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.Common.Utils.CreateNavigator(String xml, String startingNodePath, XmlNamespaceManager namespaceManager)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.Common.Utils.XPathObject[T](String xml, String testExpression, XmlNamespaceManager namespaceManager, String startingNodePath)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.Common.Utils.XPathStringTest(String xml, String testExpression, Object expected, XmlNamespaceManager namespaceManager, String startingNodePath)\r [E:\oss\corefx\bin\tools\fxbuild.proj] E:\oss\corefx\bin\tools\tests.targets(17,5): error : at XPathTests.FunctionalTests.CoreFunctionLibrary.NodeSetFunctionsTests.NodeSetFunctionsTest22100() [E:\oss\corefx\bin\tools\fxbuild.proj] 201 area-System.Numerics Vector2/Vector3 could implement operator overloads For the sake of intuitive use, Vector2 and Vector3's `Public operator methods` could be extended with actual operator overloads. This promotes intuitive usage where `var vec = vecA * vecB;` is assumed to be more programmer-friendly than `var vec = Vector3.Multiply(vecA, vecB);`. I'm looking forward to your thoughts. 218 area-System.Console Add tests for System.Console library 219 area-System.Runtime Consider adding API to return common Funcs and Actions without reallocating See pull request #181: In that case, caching was proposed for System.Linq.Parallel. 
This means that other libraries can't share the same cache (which can grow by the number of generic instantiations). Furthermore, since the actual call sites affected in PLINQ were already allocating considerably, it is doubtful that PLINQ would benefit from the optimization. We should study if we can design public API in the BCL in the same vein as the new `Array.Empty<T>().` _Before_ we do this, we should: 1. Gather data about how many places we have across the stack that would benefit from these shared allocations and find use cases in more compelling places than PLINQ. 2. Design the API. We can use this issue to discuss and evolve it in to a mini-spec. Here's an idea just to get that ball rolling on that. ``` C# namespace System { public static class CommonAction { public static Action Empty { get; } } public static class CommonFunc { public static Func<T, bool> AlwaysTrue<T>(); public static Func<T, T> Identity<T>(); public static Func<TArgument, TType> IsInstanceOf<TArgument, TType>(); public static Func<TArgument, TType> Cast<TArgument, TType>(); } } ``` 220 area-System.Console Fix /property:OS=Unix build of System.Console Some resource strings got lost in the move to GitHub. There was also an inconsistency with the available NuGet packages and where SafeHandle lives. 221 area-System.Numerics Fix vector tests warnings about obsolete Marshal.SizeOf The System.Numerics.Vectors tests were causing build warnings like the following: ``` C# Vector2Tests.cs(15,29): warning CS0618: 'System.Runtime.InteropServices.Marshal.SizeOf(System.Type)' is obsolete: 'SizeOf(Type) may be unavailable in future releases. Instead, use SizeOf<T>(). ``` This changes does exactly that. 224 area-System.Console Implement simple console foreground/background color scoping via IDisposable "Just a simple API evolution that allows the developer to ""scope"" his color updates. 
Can be used like that : ``` csharp static void Main(string[] args) { Console.WriteLine(""Default""); using (Console.UseBackgroundColor(ConsoleColor.Red)) { Console.WriteLine(""Background red""); using (Console.UseForegroundColor(ConsoleColor.Green)) { Console.WriteLine(""Foreground green""); using (Console.UseForegroundColor(ConsoleColor.Yellow)) { Console.WriteLine(""Foreground yellow""); } Console.WriteLine(""Foreground green""); using (Console.UseBackgroundColor(ConsoleColor.Blue)) { Console.WriteLine(""Background blue""); } Console.WriteLine(""Background red""); } } Console.WriteLine(""Default""); } ``` And outputs like that :  " 227 area-System.Text Added System.Text.RegularExpressions source This adds the product source code for the System.Text.RegularExpressions library. The tests will require the CoreCLR test runner to execute, so those will be added once the runner is available. 228 area-System.Xml Some tests are failing because of lacking System.String.Format overload build /p=IncludeTraits=ActiveIssue=228 Example tests Test Name: NameWithWhitespace Test FullName: XmlDocumentTests.XmlDocumentTests.CreateElementTests.NameWithWhitespace Test Source: e:\oss\corefx\src\System.Xml.XmlDocument\tests\XmlDocumentTests\CreateElementTests.cs : line 90 Test Outcome: Failed Test Duration: 0:00:00.091 Result Message: Assert.Throws() Failure Expected: typeof(System.Xml.XmlException) Actual: typeof(System.MissingMethodException): Method not found: 'System.String System.String.Format(System.IFormatProvider, System.String, System.Object)'. 
Result StackTrace: at System.Xml.XmlExceptionHelper.BuildCharExceptionArgs(Char invChar, Char nextChar) at System.Xml.XmlExceptionHelper.BuildCharExceptionArgs(String data, Int32 invCharIndex) in e:\oss\corefx\src\Common\src\System\Xml\ValidateNames.cs:line 349 at System.Xml.XmlDocument.CheckName(String name) in e:\oss\corefx\src\System.Xml.XmlDocument\src\System\Xml\Dom\XmlDocument.cs:line 110 at System.Xml.XmlElement..ctor(XmlName name, Boolean empty, XmlDocument doc) in e:\oss\corefx\src\System.Xml.XmlDocument\src\System\Xml\Dom\XmlElement.cs:line 22 at System.Xml.XmlDocument.CreateElement(String prefix, String localName, String namespaceURI) in e:\oss\corefx\src\System.Xml.XmlDocument\src\System\Xml\Dom\XmlDocument.cs:line 713 at System.Xml.XmlDocument.CreateElement(String name) in e:\oss\corefx\src\System.Xml.XmlDocument\src\System\Xml\Dom\XmlDocument.cs:line 492 at XmlDocumentTests.XmlDocumentTests.CreateElementTests.<>c__DisplayClass10.<NameWithWhitespace>b__c() in e:\oss\corefx\src\System.Xml.XmlDocument\tests\XmlDocumentTests\CreateElementTests.cs:line 92 230 area-Meta XML Documentation comments not included "I was looking through the recently added Regex code and noticed that a lot of XML documentation summaries are not there, but they do show up in the .NET Framework documentation. Are the comments stored in an external file? If so, isn't it better to include them in the code or provide the external files? Some people would like to compile the libraries themselves and they would be missing a lot of inline documentation. **Example** _.NET Framework_  _.NET Core_  <!--- @huboard:{""order"":303.0} --> " 231 area-System.Text Improve performance of Regex ctor and IsMatch "The Regex class maintains a cache of byte codes, which the Regex ctor indexes into using a key. 
It uses this seemingly innocuous line to create that key: ``` C# String key = ((int)options).ToString(NumberFormatInfo.InvariantInfo) + "":"" + cultureKey + "":"" + pattern; ``` This, however, has the unfortunate effect of allocating a string for the options, a string array for the five strings to be passed to the String.Concat call generated by the compiler, another string array allocation inside of Concat, and then the resulting string for the whole operation. The cost of those allocations is causing a non-trivial slowdown for repeated Regex.IsMatch calls for simple regular expressions, such as for a phone number (e.g. from the MSDN docs ""^\d{3}-\d{3}-\d{4}$""). This commit adds a new struct key type that just stores the constitutent options, cultureKey, and pattern, rather than creating a string to store them. That key is then what's stored in each entry in the cache. For repeated Regex.IsMatch calls for basic regular expressions like the phone number one previously mentioned, on my machine this improves throughput by ~35%, in large part due to an ~80% reduction in number of allocations, and (for this particular test case) an ~70% reduction in number of bytes allocated (it depends primarily on the length of the pattern and the length of the culture name). " 234 area-System.Numerics Some System.Numerics.Vector tests are disabled without comment build /p:IncludeTraits=ActiveIssue=234 235 area-System.Linq Some System.Linq.Parallel tests are disabled without comment build /p:IncludeTraits=ActiveIssue=235 240 area-System.Linq Some System.Linq.Parallel tests are super slow on CI Deadlock or super slow build /p:IncludeTraits=ActiveIssue=240 241 area-System.Linq Fix multiple projects for optimizations, debug/trace, etc. System.Linq.Parallel.csproj wasn't outputting symbols, and it didn't have optimizations enabled in Release. System.Numerics.Vectors.csproj was setting TRACE in Release builds but not in Debug builds. 
System.Text.RegularExpressions.csproj wasn't setting either DEBUG nor RELEASE constants in either Debug/Release builds, it didn't have optimizations enabled in Release, and it wasn't generating symbols in either. The XML libraries were't setting DEBUG/TRACE in either Debug/Release builds, weren't enabling optimizations in Release builds, and weren't generating symbols in either. DEBUG is important in almost all of these libraries, as they use Debug.\* for asserts and the like. Symbol generation is important for debugging. Optimizations are obviously important (but just to verify I saw significant performance wins in multiple libraries just from turning this on in release). TRACE isn't really important, as none of the libraries do release tracing, but I thought it good hygiene to enable it. 247 area-Meta Add Security transparency Since Code Access Security is a thing of the past with .NET Core, I wouldn't expect permissions or security attributes to be used (or even to exist). I didn't see permissions anywhere in the current code: they seem to have be completely removed from .NET Core as expected. However, [several committed files](https://github.com/dotnet/corefx/search?q=SecurityCritical+OR+SecuritySafeCritical&type=Code) still use `SecurityCriticalAttribute` or `SecuritySafeCriticalAttribute`. Is this a left-over from the .NET framework code? I expect so, since they mean nothing without permission checks. If this is intended, what are the guidelines about using them? 249 area-System.Collections Memory allocation documentation for ImmutableDictionary Keys and Values ## Background As described in #246 (and previously in #147), the `Keys` and `Values` properties of `ImmutableDictionary<TKey, TValue>` cannot be used without incurring memory allocations on the heap. However, a slight modification of user code provides _equivalent_ behavior without these allocations. With allocations: ``` csharp foreach (var key in dictionary.Keys) { ... 
} ``` Without allocations: ``` csharp foreach (var pair in dictionary) { var key = pair.Key; } ``` ## Suggestion Since updating the `Keys` and `Values` properties to eliminate the allocations would be a breaking change (#246), the only recourse is to update the documentation for these properties to describe steps a user can take to avoid allocations if these properties are used heavily within a particular application. ## Caveat Care must be taken to ensure that the documentation does not read in a way that suggests the `Keys` and `Values` properties should not be used. The properties behave properly and would not be even a measurable source of time spent in the application in nearly all cases. This notice is only meant for users making heavy use of these properties who are also actively working to reduce the memory allocations performed within an especially performance-sensitive section of code. 251 area-System.Text Fix DEBUG constant name It looks like DBG was being used instead of DEBUG at System.Text.RegularExpressions. Was this on purpose? 252 area-System.Text Fix some more XML docs I know that the documentation will be updated eventually, but in the meanwhile we can have a few less compiler warnings if you guys agree. 253 area-System.Text Fix GetGroup never returning EmptyGroup This is a very obvious bug: _captureMap probably used to be a hashtable and the code was not properly refactored. I took the opportunity to redo the method using the Dictionary equivalent logic. Not adding unit tests for this, because the unit test project seems to be missing, and I'm not sure if you guys are going to add it or not, so let me know how to proceed. 254 area-System.Text Improve Dictionary access Just spotted this quick dictionary access optimization. 
255 area-System.Text Improve Dictionary access 262 area-System.Reflection System.Reflection.Metadata should provide easier signature reading API **EDIT** 12/3/2015 - Replaced entire description with the detailed proposal matching PR #4809 Today, System.Reflection.Metadata provides low-level access to ECMA-335 CLI metadata, but only provides signatures as blobs that must be parsed with direct knowledge of the format as described in the section II.23.2 of the CLI specification. There is a `BlobReader` for reading various elements out of blobs, but it is up to the caller to read the right things at the right positions. This was by-design as the library is designed to sit at the lowest level behind the scenes of higher level API such as Reflection proper, Roslyn, or CCI. The challenge with signatures is that they are variable-length and encode tree structures, and each higher level model that could sit on top of System.Reflection.Metadata will have different representations for the trees. We do not want to introduce an API to build a fully-formed tree that then has to be traversed and rewritten to match the actual use case. This proposal is therefore a middle ground between: 1. Here are the bytes and some helpers. Read the spec to decode. (Status quo) 2. Yet another high-level metadata API. (Out-of-scope for this layer) It works as follows: 1. The caller chooses an arbitrary representation, `TType`, for type symbols and implements `ISignatureTypeProvider<TType>`. (See full API spec below). 2. 
Signatures are parsed by recursive descent and the provider is called to create new type nodes: - _Give me the `TType` that represents this primitive_ - _Give me the `TType` that represents this TypeDefinition_ - _Give me the `TType` that represents an array of this other TType_ - _etc._ # Sample Usage Given a suitable `TypeSymbol` and `TypeSymbolProvider : ISignatureTypeProvider<TypeSymbol>`, here is code walking all of the `TypeSymbol`'s for every field, parameter, and return type: ``` C# using (var stream = File.OpenRead(pathToDll)) using (var peReader = new PEReader(stream)) { var reader = peReader.GetMetadataReader(); var provider = new TypeSymbolProvider(); foreach (TypeDefinitionHandle typeHandle in reader.TypeDefinitions) { TypeDefinition type = reader.GetTypeDefinition(typeHandle); foreach (FieldDefinitionHandle fieldHandle in type.GetFields()) { FieldDefinition field = reader.GetFieldDefinition(fieldHandle); TypeSymbol fieldType = field.DecodeSignature(provider); // ... } foreach (MethodDefinitionHandle methodHandle in type.GetMethods()) { MethodDefinition method = reader.GetMethodDefinition(methodHandle); MethodSignature<TypeSymbol> methodSig = method.DecodeSignature(provider); TypeSymbol returnType = methodSig.ReturnType; // ... foreach (TypeSymbol parameterType in methodSig.ParameterTypes) { // ... } } } } ``` # Full API ## Additions to existing types These provide convenience entry points. There are other use cases where you want to parse only part of a signature or a signature that you did not obtain from the metadata reader. 
For that, the `SignatureDecoder` ``` C# namespace System.Reflection.Metadata { public struct FieldDefinition { public TType DecodeSignature<TType>( ISignatureTypeProvider<TType> provider, SignatureDecoderOptions options=SignatureDecoderOptions.None); } public struct MemberReference { public TType DecodeFieldSignature<TType>( ISignatureTypeProvider<TType> provider, SignatureDecoderOptions options=SignatureDecoderOptions.None); public MethodSignature<TType> DecodeMethodSignature<TType>( ISignatureTypeProvider<TType> provider, SignatureDecoderOptions options=SignatureDecoderOptions.None); // MemberReferenceKind GetKind(); already exists } public struct MethodDefinition { public MethodSignature<TType> DecodeSignature<TType>( ISignatureTypeProvider<TType> provider, SignatureDecoderOptions options=SignatureDecoderOptions.None); } public struct MethodSpecification { public ImmutableArray<TType> DecodeSignature<TType>( ISignatureTypeProvider<TType> provider, SignatureDecoderOptions options=SignatureDecoderOptions.None); } public struct PropertyDefinition { public MethodSignature<TType> DecodeSignature<TType>( ISignatureTypeProvider<TType> provider, SignatureDecoderOptions options=SignatureDecoderOptions.None); } public struct StandaloneSignature { public ImmutableArray<TType> DecodeLocalSignature<TType>( ISignatureTypeProvider<TType> provider, SignatureDecoderOptions options=SignatureDecoderOptions.None); public MethodSignature<TType> DecodeMethodSignature<TType>( ISignatureTypeProvider<TType> provider, SignatureDecoderOptions options=SignatureDecoderOptions.None); public StandaloneSignatureKind GetKind(); } public enum StandaloneSignatureKind { LocalVariables = 1, Method = 0, } public struct TypeSpecification { public TType DecodeSignature<TType>( ISignatureTypeProvider<TType> provider, SignatureDecoderOptions options=SignatureDecoderOptions.None); } } ``` ## New types ``` C# namespace System.Reflection.Metadata.Decoding { public struct ArrayShape { public ArrayShape(int 
rank, ImmutableArray<int> sizes, ImmutableArray<int> lowerBounds); public ImmutableArray<int> LowerBounds { get; } public int Rank { get; } public ImmutableArray<int> Sizes { get; } } public interface IConstructedTypeProvider<TType> : ISZArrayTypeProvider<TType> { TType GetArrayType(TType elementType, ArrayShape shape); TType GetByReferenceType(TType elementType); TType GetGenericInstance(TType genericType, ImmutableArray<TType> typeArguments); TType GetPointerType(TType elementType); } public interface IPrimitiveTypeProvider<TType> { TType GetPrimitiveType(PrimitiveTypeCode typeCode); } public interface ITypeProvider<TType> { TType GetTypeFromDefinition( MetadataReader reader, TypeDefinitionHandle handle, SignatureTypeHandleCode code); TType GetTypeFromReference( MetadataReader reader, TypeReferenceHandle handle, SignatureTypeHandleCode code); } public interface ISignatureTypeProvider<TType> : IConstructedTypeProvider<TType>, IPrimitiveTypeProvider<TType>, ITypeProvider<TType> { TType GetFunctionPointerType(MethodSignature<TType> signature); TType GetGenericMethodParameter(int index); TType GetGenericTypeParameter(int index); TType GetModifiedType( MetadataReader reader, bool isRequired, EntityHandle modifierTypeHandle, TType unmodifiedType); TType GetPinnedType(TType elementType); } public interface ISZArrayTypeProvider<TType> { TType GetSZArrayType(TType elementType); } public struct MethodSignature<TType> { public MethodSignature( SignatureHeader header, TType returnType, int requiredParameterCount, int genericParameterCount, ImmutableArray<TType> parameterTypes); public int GenericParameterCount { get; } public SignatureHeader Header { get; } public ImmutableArray<TType> ParameterTypes { get; } public int RequiredParameterCount { get; } public TType ReturnType { get; } } public enum PrimitiveTypeCode : byte { Boolean = (byte)2, Byte = (byte)5, Char = (byte)3, Double = (byte)13, Int16 = (byte)6, Int32 = (byte)8, Int64 = (byte)10, IntPtr = (byte)24, Object = 
(byte)28, SByte = (byte)4, Single = (byte)12, String = (byte)14, TypedReference = (byte)22, UInt16 = (byte)7, UInt32 = (byte)9, UInt64 = (byte)11, UIntPtr = (byte)25, Void = (byte)1, } public struct SignatureDecoder<TType> { public SignatureDecoder( ISignatureTypeProvider<TType> provider, MetadataReader metadataReader=null, SignatureDecoderOptions options=SignatureDecoderOptions.None); public TType DecodeFieldSignature(ref BlobReader blobReader); public ImmutableArray<TType> DecodeLocalSignature(ref BlobReader blobReader); public MethodSignature<TType> DecodeMethodSignature(ref BlobReader blobReader); public ImmutableArray<TType> DecodeMethodSpecificationSignature(ref BlobReader blobReader); public TType DecodeType(ref BlobReader blobReader); } public enum SignatureDecoderOptions { DifferentiateClassAndValueTypes = 1, None = 0, } public enum SignatureTypeHandleCode : byte { Class = (byte)18, Unresolved = (byte)0, ValueType = (byte)17, } } ``` # Notes - The interface segregation of `ISignatureTypeProvider<T>` is for future `TypeNameParser` and `CustomAttributeDecoder` (still under development in dev/metadata branch), which share some but not all of the same requirements for a type provider as the `SignatureDecoder` 267 area-System.Text Remove superfluous internal access modifier I'm not sure I see the point in having the internal modifier on these fields, but if I'm missing something feel free to close. 271 area-System.Text Regex collections should implement generic collection interfaces "`CaptureCollection`, `GroupCollection`, and `MatchCollection` currently only implement the non-generic `ICollection` interface. These collections should implement the generic collection interfaces to better interoperate with more modern APIs, such as LINQ. Since these collections are already indexable, they should implement `IList<T>` and `IReadOnlyList<T>`, as well as the non-generic `IList` (to be consistent with the generic interfaces). 
## Rationale and Usage This is certainly a nice-to-have, but it _is_ a long-standing [request](https://connect.microsoft.com/VisualStudio/feedback/details/469250/capturecollection-groupcollection-and-matchcollection-do-not-implement-icollection-t-and-ienumerable-t) that [developers](http://stackoverflow.com/q/7274422) [still](http://jimlynn.wordpress.com/2009/11/04/why-cant-i-use-linq-on-a-capturecollection/) [ask](http://stackoverflow.com/q/16379433) [about](http://stackoverflow.com/q/4108956). Implementing the generic interfaces will allow these collections to be used more easily with LINQ and interoperate better with more modern framework and library APIs. For example, to use these collections with LINQ right now you have to know about and remember to use `Enumerable.Cast<TSource>()` to cast the non-generic `IEnumerable` into an `IEnumerable<T>`: ``` c# var captures = from capture in match.Groups.Cast<Group>().Last().Captures.Cast<Capture>() select capture.Value; ``` With these changes you'd no longer have to do that: ``` c# var captures = from capture in match.Groups.Last().Captures select capture.Value; ``` Plus, in the above example, you'd get a performance improvement when using `Enumerable.Last<TSource>.()` as its implementation has a fast-path for collections that implement `IList<T>`. 
## Proposed API ``` c# // DebuggerDisplay and DebuggerTypeProxy added [DebuggerDisplay(""Count = {Count}"")] [DebuggerTypeProxy(typeof(RegexCollectionDebuggerProxy<Capture>))] // Previously only implemented ICollection public class CaptureCollection : IList<Capture>, IReadOnlyList<Capture>, IList { // Existing members public int Count { get; } public Capture this[int i] { get; } public IEnumerator GetEnumerator(); object ICollection.SyncRoot { get; } bool ICollection.IsSynchronized { get; } void ICollection.CopyTo(Array array, int arrayIndex); // Proposed members public void CopyTo(Capture[] array, int arrayIndex); IEnumerator<Capture> IEnumerable<Capture>.GetEnumerator(); int IList<Capture>.IndexOf(Capture item); void IList<Capture>.Insert(int index, Capture item); void IList<Capture>.RemoveAt(int index); Capture IList<Capture>.this[int index] { get; set; } void ICollection<Capture>.Add(Capture item); void ICollection<Capture>.Clear(); bool ICollection<Capture>.Contains(Capture item); bool ICollection<Capture>.IsReadOnly { get; } bool ICollection<Capture>.Remove(Capture item); int IList.Add(object value); void IList.Clear(); bool IList.Contains(object value); int IList.IndexOf(object value); IList.Insert(int index, object value); bool IList.IsFixedSize { get; } bool IList.IsReadOnly { get; } void IList.Remove(object value); void IList.RemoveAt(int index); object IList.this[int index] { get; set; } } // DebuggerDisplay and DebuggerTypeProxy added [DebuggerDisplay(""Count = {Count}"")] [DebuggerTypeProxy(typeof(RegexCollectionDebuggerProxy<Group>))] // Previously only implemented ICollection public class GroupCollection : IList<Group>, IReadOnlyList<Group>, IList { // Existing members public int Count { get; } public Group this[int groupnum] { get; } public Group this[String groupname] { get; } public IEnumerator GetEnumerator(); object ICollection.SyncRoot { get; } bool ICollection.IsSynchronized { get; } void ICollection.CopyTo(Array array, int arrayIndex); // 
Proposed members public void CopyTo(Group[] array, int arrayIndex); IEnumerator<Group> IEnumerable<Group>.GetEnumerator(); int IList<Group>.IndexOf(Group item); void IList<Group>.Insert(int index, Group item); void IList<Group>.RemoveAt(int index); Group IList<Group>.this[int index] { get; set; } void ICollection<Group>.Add(Group item); void ICollection<Group>.Clear(); bool ICollection<Group>.Contains(Group item); bool ICollection<Group>.IsReadOnly { get; } bool ICollection<Group>.Remove(Group item); int IList.Add(object value); void IList.Clear(); bool IList.Contains(object value); int IList.IndexOf(object value); IList.Insert(int index, object value); bool IList.IsFixedSize { get; } bool IList.IsReadOnly { get; } void IList.Remove(object value); void IList.RemoveAt(int index); object IList.this[int index] { get; set; } } // DebuggerDisplay and DebuggerTypeProxy added [DebuggerDisplay(""Count = {Count}"")] [DebuggerTypeProxy(typeof(RegexCollectionDebuggerProxy<Match>))] // Previously only implemented ICollection public class MatchCollection : IList<Match>, IReadOnlyList<Match>, IList { // Existing members public int Count { get; } public virtual Match this[int i] { get; } public IEnumerator GetEnumerator(); object ICollection.SyncRoot { get; } bool ICollection.IsSynchronized { get; } void ICollection.CopyTo(Array array, int arrayIndex); // Proposed members public void CopyTo(Match[] array, int arrayIndex); IEnumerator<Match> IEnumerable<Match>.GetEnumerator(); int IList<Match>.IndexOf(Match item); void IList<Match>.Insert(int index, Match item); void IList<Match>.RemoveAt(int index); Match IList<Match>.this[int index] { get; set; } void ICollection<Match>.Add(Match item); void ICollection<Match>.Clear(); bool ICollection<Match>.Contains(Match item); bool ICollection<Match>.IsReadOnly { get; } bool ICollection<Match>.Remove(Match item); int IList.Add(object value); void IList.Clear(); bool IList.Contains(object value); int IList.IndexOf(object value); 
IList.Insert(int index, object value); bool IList.IsFixedSize { get; } bool IList.IsReadOnly { get; } void IList.Remove(object value); void IList.RemoveAt(int index); object IList.this[int index] { get; set; } } ``` ## Details - There was some discussion as to whether only the read-only interfaces should be implemented, or both the read-only and mutable interfaces. The consensus is to implement both the read-only and mutable interfaces. This is consistent with other collections in the framework. The mutable interfaces are implemented as read-only: mutable members are implemented explicitly and throw `NotSupportedException` (like `ReadOnlyCollection<T>`). - There was an open question as to whether the non-generic `IList` should be implemented as well. These collections are indexable and if `IList<T>` and `IReadOnlyList<T>` are being implemented, `IList` should be implemented as well. This does add several more members, but they are all implemented explicitly so they don't add any new public members to intellisense, and the implementations are very straightforward. - `ICollection<T>.CopyTo` is implemented implicitly (public). - All other new members are implemented explicitly (non-public): - Mutable members are implemented explicitly because these collections are read-only and the mutable members throw `NotSupportedException` (like `ReadOnlyCollection<T>`). - `IList` members are implemented explicitly to hide non-generic members from intellisense. - `IList<T>.IndexOf` and `ICollection<T>.Contains` are implemented explicitly because these methods aren't very useful for these collections and should not be visible in intellisense by default. They're not useful because an implementation using `EqualityComparer<T>.Default` (consistent with other collections) will search the collection using reference equality due to the fact that `Capture`, `Group`, and `Match` do not implement `IEquatable<T>` and do not override `Equals()` and `GetHashCode()`. 
Further, these types do not have public constructors -- they are created internally by the regex engine, making it very unlikely that you'd want to search for an item in a collection ""A"" that was obtained from collection ""B"". - `IEnumerable<T>.GetEnumerator()` must be implemented explicitly because the non-generic `IEnumerable.GetEnumerator()` is already implemented implicitly and we can't overload on return type. This also precludes returning a `struct` `Enumerator` (for better enumeration performance) because changing the return type of the existing method would be a binary breaking change. As a result, you'll still have to specify the type when using `foreach` (e.g. `foreach (Capture capture in captures)`); you won't be able to use `var` (e.g. `foreach (var capture in captures)`), unfortunately. ## Open Questions - Should `GroupCollection` implement `IDictionary<string, Group>`, `IReadOnlyDictionary<string, Group>`, and `IDictionary`? `GroupCollection` already has a string indexer. Is it worth implementing the dictionary interfaces as part of this? Personally, I'm leaning toward ""no"" because there isn't a compelling scenario for the dictionary interfaces, and they can always be added in the future when needed. ## Pull Request A PR with the proposed changes is available: #1756 ## Updates - Edited this description to make it more of a _speclet_, based on the discussion below and the proposed API Review process. - Some improvements based on feedback from @sharwell. - Fixed existing members. - Added `IList`. These collections are indexable and it would be strange if `IList<T>` and `IReadOnlyList<T>` were implemented alongside `ICollection` but without `IList`. - Added `DebuggerDisplay` and `DebuggerTypeProxy` attributes. - Made `ICollection<T>.CopyTo` implicit (public). " 277 area-System.Text Update regular expressions collections to implement IReadOnlyList<T> Fixes #271 I believe this is the least intrusive path to implementing the requested functionality. 
I considered the following items but decided to not include them with this pull request. - `GroupEnumerator` is internal, so it might be possible to change the type of `GroupEnumerator.Capture` from `Capture` to `Group`. This would make the explicit cast in the generic `Current` property unnecessary. - I did not update `CaptureCollection` to implement `IReadOnlyDictionary<string, Capture>`, primarily because it's a larger change with the addition of the `Keys` and `Values` properties (even if they are explicitly implemented). This should probably be reviewed independently from a straightforward implementation of `IReadOnlyList<T>`. 281 area-System.Diagnostics System.Diagnostics.FileVersionInfo different/incorrect behaviors compared to .NET Framework FileVersionInfo solely depends on System.IO.FileSystem (and all of its dependencies, including System.IO, System.Threading.Tasks, etc) just to do a File.Exists in GetVersionInfo. We should remove this and instead see if we can use the return value of GetFileVersionInfoSizeEx to mimic the same behavior. 283 area-System.Linq System.Linq.Parallel test assembly disabled as it takes too long on AppVeyor The System.Linq.Parallel test assembly is disabled because it takes 50 seconds to run on my development machine and lots and lots of time to run on AppVeyor. 
284 area-System.Xml XPath AxesCombination tests are failing on debug builds While running with debug builds the following AxesCombination Tests are failing with the below assert: \src\Common\tests\System.Xml.XPath\FuncLocation\PathAxeCombinationsTests.cs AxesCombinationsTest2142 AxesCombinationsTest2143 AxesCombinationsTest2144 IsSamePosition() on custom navigator returns inconsistent results at XPathNavigator.ComparePosition(XPathNavigator nav) d:\github\corefx\src\System.Xml.XPath\src\System\Xml\XPath\XPathNavigator.cs(711) at Query.CompareNodes(XPathNavigator l, XPathNavigator r) d:\github\corefx\src\System.Xml.XPath\src\System\Xml\XPath\Internal\Query.cs(150) at Query.Insert(List`1 buffer, XPathNavigator nav) d:\github\corefx\src\System.Xml.XPath\src\System\Xml\XPath\Internal\Query.cs(109) at DocumentOrderQuery.Evaluate(XPathNodeIterator context) d:\github\corefx\src\System.Xml.XPath\src\System\Xml\XPath\Internal\DocumentOrderQuery.cs(20) at BaseAxisQuery.Evaluate(XPathNodeIterator nodeIterator) d:\github\corefx\src\System.Xml.XPath\src\System\Xml\XPath\Internal\BaseAxisQuery.cs(108) at XPathNavigator.Evaluate(XPathExpression expr, XPathNodeIterator context) d:\github\corefx\src\System.Xml.XPath\src\System\Xml\XPath\XPathNavigator.cs(802) at XPathNavigator.Evaluate(XPathExpression expr) d:\github\corefx\src\System.Xml.XPath\src\System\Xml\XPath\XPathNavigator.cs(784) at XPathNavigator.Select(XPathExpression expr) d:\github\corefx\src\System.Xml.XPath\src\System\Xml\XPath\XPathNavigator.cs(764) at Utils.XPathNodesetTest(String xml, String testExpression, XPathResult expected, XmlNamespaceManager namespaceManager, String startingNodePath) d:\github\corefx\src\Common\tests\System.Xml.XPath\Common\Utils.cs(128) at AxesCombinationsTests.AxesCombinationsTest2144() d:\github\corefx\src\Common\tests\System.Xml.XPath\FuncLocation\PathAxeCombinationsTests.cs(44979) 292 area-System.Text Add Regex Tests This adds our tests for the System.Text.RegularExpressions library. 
Currently, these tests (and almost all of the tests that we will be bringing out from our internal sources) are going to print a lot of things to the console, as that is the testing paradigm we follow internally. We will be coming up with a way to either redirect the test output, or make it configurable. 293 area-System.Diagnostics Add FileVersionInfo tests This adds all of our tests for the System.Diagnostics.FileVersionInfo library. Included also are the assemblies and executables that are used as test assets. 294 area-Meta Proposal for the API Review process Now that we’re on GitHub we also get requests for new APIs. The current API review process is designed around the assumption that it is internal-only. We need to rethink this process for an open source world. This issue represents a proposal how this could be handled. It’s not final – any feedback is highly appreciated. ## Process Goals The key goals are: - **Designed for GitHub**. In order to be sustainable and not be a hurdle for contributors the API review process must feel natural to folks familiar with GitHub. - **Efficiency**. Performing API reviews requires looping in a set of experts. We want to conduct API reviews in an agile fashion without randomizing the reviewers or community members. - **Transparency**. We can use the same process for both internal as well as external contributors. This allows contributors to benefit from the results of API reviews even if the implementer isn’t external. ## Overall Process GitHub is generally based around the pull-request model. The idea is that contributors perform their changes in their own fork and submit a pull request against our repository. For trivial code changes, such as typo fixes, we want folks to directly submit a pull request rather than opening an issue. However, for bug fixes or feature work, we want contributors to first start a discussion by creating an issue. 
For work that involves adding new APIs we'd like the issue to contain what we call a _speclet_. The speclet should provide a rough sketch of how the APIs are intended to be used, with sample code that shows typical scenarios. The goal isn't to be complete but rather to illustrate the direction so that readers can judge whether the proposal is sound.  ## Steps - **Contributor opens an issue**. The issue description should contain a speclet that represents a sketch of the new APIs, including samples on how the APIs are being used. The goal isn’t to get a complete API list, but a good handle on how the new APIs would roughly look like and in what scenarios they are being used. - **Community discusses the proposal**. If changes are necessary, the contributor is encouraged to edit the issue description. This allows folks joining later to understand the most recent proposal. To avoid confusion, the contributor should maintain a tiny change log, like a bolded “Updates:” followed by a bullet point list of the updates that were being made. - **Issue is tagged as “Accepting PRs”**. Once the contributor and project owner agree on the overall shape and direction, the project owner tags the issue as “Accepting PRs”. The contributor should indicate whether they will be providing the PR or only contributed the idea. - **Coding**. The contributor is implementing the APIs as discussed. Minor deviations are OK, but if during the implementation the design starts to take a major shift, the contributor is encouraged to go back to the issue and raise the concerns with the current proposal. - **Pull request is being created**. Once the contributor believes the implementation is ready for review, she creates a pull request, referencing the issue created in the first step. - **Pull request is being reviewed**. The community reviews the code for the pull request. The review should focus on the code changes and architecture – not the APIs themselves. 
Once at least two project owners give their OK, the PR is considered good to go. - **Pull is tagged as “Needs API Review”**. The project owner then marks the pull request as “Needs API Review”. - **API review**. Using the information in the pull request we’ll create an APIX file that constitutes the API delta. The API review board meets multiple times a week to review all PRs that are tagged as needing an API review. - **Pull request is updated with the results of the API Review**. Once the API review is complete, the project owner uploads the notes and API HTML diff, including all comments. The project owner also updates the PR accordingly, with either a call to action to address some concerns or a good to go indicator. - **Pull request is merged**. When there are no issues – or the issues were addressed by the contributor, the PR is merged. ## API Design Guidelines The .NET design guidelines are captured in the famous book [Framework Design Guidelines](http://amazon.com/dp/0321545613) by Krzysztof Cwalina and Brad Abrams. A digest with the most important guidelines are available in our [developer wiki](https://github.com/dotnet/corefx/wiki/Framework%20Design%20Guidelines%20Digest). Long term, we'd like to publish the individual guidelines in standalone repo on which we can also accept PRs and -- more importantly for API reviews -- link to. 295 area-System.Diagnostics Fix file name casing in System.Diagnostics.FileVersionInfo.csproj The file on disk is called Interop.manual.cs, not Interop.Manual.cs. This fixes compilation with Mono on Linux (or other case-sensitive FS). 298 area-System.Xml Use String.Equals instead of String.Compare for equality checks Per the [Best Practices for Using Strings in the .NET Framework](http://msdn.microsoft.com/en-us/library/dd465121%28v=vs.110%29.aspx): - Use an overload of the `String.Equals` method to test whether two strings are equal. 
- Use the `String.Compare` and `String.CompareTo` methods to sort strings, **not to check for equality**. 299 area-Meta Removing `readonly` from a field can be a breaking change Currently the [**Signatures** section of the Breaking Change Rules](https://github.com/dotnet/corefx/wiki/Breaking-Change-Rules#signatures) page includes the following: > ✓ Allowed > - Removing `readonly` from a field As hinted in [a recent tweet](https://twitter.com/samharwell/status/543970816321650688), this should probably be disallowed in a particular case. I would prefer not to give the answer away publicly yet, but feel free to contact me by email. Here is the original tweet: > C# Quiz: Add `readonly` to a field only set in the constructor, which changes the runtime behavior of the type (no use of reflection). 300 area-Meta Refactor System.Servicemodel.ClientBase to have non-generic base "The following is an example of a generic extension method I would like to write, but cannot because `ClientBase<T>` does not inherit from a base class containing the non generic members. It would be very helpful if `ClientBase<T>` was refactored into `ClientBase<T>:ClientBase` ``` C# using DotNetOpenAuth.OAuth2; using System; using System.Net; using System.ServiceModel; using System.ServiceModel.Channels; using System.Threading; using System.Threading.Tasks; namespace OAuthClient { public static class ExtensionMethods { public static async Task<T> CallAsync<TClient, TChannel, T>(this TClient wcfClient, Func<TClient, T> predicate, IAuthorizationState authorization, CancellationToken cancellationToken) where TClient : ClientBase<TChannel> where TChannel : class { if (authorization == null) { throw new InvalidOperationException(""No access token!""); } // Refresh the access token if it expires and if its lifetime is too short to be of use. 
if (authorization.AccessTokenExpirationUtc.HasValue) { await AuthorizationServer.Client.RefreshAuthorizationAsync(authorization, TimeSpan.FromSeconds(30)); } var httpRequest = (HttpWebRequest)WebRequest.Create(wcfClient.Endpoint.Address.Uri); ClientBase.AuthorizeRequest(httpRequest, authorization.AccessToken); var httpDetails = new HttpRequestMessageProperty(); httpDetails.Headers[HttpRequestHeader.Authorization] = httpRequest.Headers[HttpRequestHeader.Authorization]; using (var scope = new OperationContextScope(wcfClient.InnerChannel)) { OperationContext.Current.OutgoingMessageProperties[HttpRequestMessageProperty.Name] = httpDetails; bool isError = true; try { T result = predicate(wcfClient); // If there is an error on the channel, the close call will throw an exception. wcfClient.Close(); isError = false; return result; } finally { // If we have an error on the channel, we cannot close the channel so we abort. // Exceptions that occur will still be raised unmodified. if (isError) { wcfClient.Abort(); } } } } } } ``` " 301 area-System.Diagnostics FileVersionInfo Cleanup This addresses code review feedback from #274. /cc @mellinoe and @stephentoub 304 area-System.Text Regex should provide a validation method "`Regex` doesn't provide a way to validate whether a given string is a valid regular expression without getting an exception. It should provide one. ## Background `TryParse` is a common pattern in .NET which is useful when a potentially invalid string has to be converted to a different representation. As invalid user input is not an exceptional situation, it makes sense to avoid exception overhead (which coincidentally also educates new developers to handle those situations correctly). Regular expressions are not as common in external input as `int` or `DateTime`, however there are situations when regex is an appropriate tool for the power users (e.g. advanced text editor search/replace). 
If you want to validate the regex as it is being typed (with reasonable throttling), it is useful to have as little overhead as possible. Request for Regex.TryParse (not by me) had [5 votes on Microsoft Connect](http://webcache.googleusercontent.com/search?q=cache:KV6wcz2SJ44J:https://connect.microsoft.com/VisualStudio/feedback/details/331753/regex-should-have-a-static-tryparse-method+) and currently exists on [UserVoice with 3 votes](http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2683916-regex-should-have-a-static-tryparse-method). ## Design Options Originally I wanted to propose `Regex.TryCreate` (which is in line with `Uri.TryCreate` since `Regex` does no have a `Parse` method). However even for power users regexes might not be easy to write, and providing more information than just ""you have failed"" is useful. I can't think of any existing pattern in the BCL that fits this well, but basically something like this would be useful: ``` csharp public class Regex { // existing members… public static RegexParseResult Parse(string pattern); public static RegexParseResult Parse(string pattern, RegexOptions options); // or alternatively we could use a default value for options } public class RegexParseResult { public bool IsSuccess { get; } public Regex Regex { get; } // parser doesn't have to provide more than one, however it would allow it to do so in the future public IReadOnlyCollection<RegexParseError> Errors { get; } } public class RegexParseError { public string Message { get; } public int Start { get; } public int End { get; } } ``` ### API Concerns It would be useful to know if I am missing some existing common pattern here, so that consistency with other BCL APIs can be provided. I don't like using name `Parse` here either as it invites confusion -- however I can't currently think of a better approach. 
## Example Usage ``` csharp public class RegexValidationRule : System.Windows.Controls.ValidationRule { public override ValidationResult Validate(object value, CultureInfo cultureInfo) { var valueString = (string)value; var parseResult = Regex.Parse(valueString); if (!parseResult.IsSuccess) { var errorString = string.Join(NewLine, parseResult.Errors.Select(e => $""{e.Message} at {e.Start}"")); return new ValidationResult(false, errorString); } return ValidationResult.ValidResult; } } ``` ## Compatibility I think adding a new static method should be safe enough. " 306 area-System.Diagnostics Make Process.Start have a option to change handle inheritance "Currently if you call Process.Start internally CreateProcess is called with bInheritHandles = true (hard coded). It would be great to make it possible to change this behavior, e.g. by adding a Property to ProcessStartInfo. Currently there is no way I know of to change this other then reimplementing System.Diagnostics.Process. # Example If you run this application twice without exiting the first notepad instance the second instance will not be able to open the tcp port, because notepad is still running. This can be a problem for server applications that are starting child processes themself and crash, or are killed by the user before the socket can be closed. ``` csharp using System.Diagnostics; using System.Net; using System.Net.Sockets; class Program { static void Main() { TcpListener listener = new TcpListener(IPAddress.Any, 4567); listener.Start(); Process.Start(new ProcessStartInfo(""notepad.exe"") { UseShellExecute = false }); //Simulate application crash without freeing resources } } ``` # Design proposal The easiest way to make this possible is to add a new Property to ProcessStartInfo and use this in the Call to CreateProcess ``` csharp public sealed class ProcessStartInfo { // ... public bool InheritHandles { get; set; } // defaults to true // ... 
} ``` # Questions - Is there a very important reason why this was hardcoded like this in the first place? " 307 area-Meta Is the corefx repo or the dotnet org a good place for key community projects? Forked from #294. from @n8ohu - > I am working on uses a C# wrapper around a cross-platform C library and it would, in my opinion, be beneficial to other developers if it was part of the core; There is really a spectrum of .NET libraries, with one end point being the official corefx distribution and the other a weekend project never to be touched again. What you are proposing is much closer to the former. I think your base question is whether corefx or a peer repo is appropriate for key community-owned projects. Your scenario is C# wrappers over native libraries, but it could also be a pure managed library, too. We are considering a similar plan for our own libraries that don't fit or don't fit yet in corefx. We can and should include community libraries in that thinking. The Mono org seems to embrace community libraries that have broad applicability. We should take some inspiration from that. It's always a fine idea to host a library out of your own GitHub account. At the same time, we are intending the dotnet org to be a center of gravity for .NET library development. That's why we put the corefx repo there (and same will be true for the coreclr one) instead of the Microsoft org. It's likely the case that discoverability will be better in the dotnet org than an arbitrary personal repo. What do you think? 311 area-System.Console Bring back Console.CancelKeyPress "We need this so that we can wait on Ctrl + C events in console applications. See https://github.com/aspnet/Hosting/blob/dev/src/Microsoft.AspNet.Hosting/Program.cs#L69. <!--- @huboard:{""order"":311.0,""milestone_order"":311,""custom_state"":""""} --> " 312 area-System.IO FileSystem cross platformness and testability Building cross platform applications that talk to the file system is painful. 
There are several issues like case sensitivity, slash differences and other bizarre restrictions and differences. On another front, testability of the file system has always been impossible with .NET. Any application/library that uses the FileSystem uses something like https://www.nuget.org/packages/System.IO.Abstractions or a custom abstraction to allow mockability. Are there any plans to improve the System.IO API to solve some of these problems? 313 area-System.Numerics System.Numerics.Vectors Fails to Load W/ .NET 4.6 Preview Installed "``` An unhandled exception of type 'System.IO.FileLoadException' occurred in Unknown Module. Additional information: Could not load file or assembly 'System.Numerics.Vectors, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040) ``` Steps to Reproduce. - Create a console application. - Add the nuget package for System.Numerics.Vector to the project (I tried 1.1.5-beta and 1.1.6-beta). - Add some dependent code (I used `Console.WriteLine(""Hardware Accelerated Vectors: {0}"", Vector.IsHardwareAccelerated);`) - Run the project. - Get above exception. It looks like somehow VS is getting the version number pinned at 4.0.0 and rejects the load. Specific version is disabled, the DLL is being copied as expected and no relevant entries appear to be in the app.config. " 316 area-System.Text Implement IList<T>, IReadOnlyList<T>, and IList on Regex Collections Fixes #271 This is still a work-in-progress. 
### TODO - [x] Implement `IList<T>`, `IReadOnlyList<T>`, and `IList` - [x] Add `RegexCollectionDebuggerProxy` - [x] API review - [x] Changes based on API review and initial code review feedback - [ ] Cleanup - [ ] Add tests - [ ] Final code review ### Questions/Notes - As @sharwell noted in #277: `GroupEnumerator` is internal, so it might be possible to change the type of `GroupEnumerator.Capture` from `Capture` to `Group`. This would make the explicit cast in the generic `Current` property unnecessary. 317 area-System.Text Simplify MatchCollection's ICollection.CopyTo - Uses [`List<T>`'s implementation of `ICollection.CopyTo`](https://github.com/Microsoft/referencesource/blob/9da503f9ef21e8d1f2905c78d4e3e5cbb3d6f85a/mscorlib/system/collections/generic/list.cs#L394-L407) which is behavior compatible with the previous implementation. - Avoids an unnecessary array allocation by no longer using `List<T>.ToArray()`. - `SR.Arg_RankMultiDimNotSupported` and `SR.Arg_InvalidArrayType` are no longer needed. - Avoids accessing a property to trigger a side effect. 321 area-System.Text Regex capture, match and group collections now implement the generic IEnumerable This is my first contribution to the .NET Core. I would like to know whether this change would be acceptable under the guidelines of the .NET framework. I am asking because I am making classes implement an interface (acceptable by your rules), but I am also changing the existing interface implementation to an explicit one, in favor of the implicit implementation being the new interface (which is a really grey area that is not included in the rules). Please guide me as to how this pull request needs to change to become acceptable. 322 area-Meta Branching (or not) While commenting on [an outstanding pull request](https://github.com/dotnet/corefx/pull/110#issuecomment-65910391), David Kean mentioned: > We're trying to figure out our branching structure at the moment. 
Currently we have only a single branch that represents what we're shipping for the next update of .NET Core. Clearly this isn't scalable for the product, and we're going to creating branches for future work so PR's that have API changes and destablizing changes get a little more bake time. @joshfree is looking into that. I would like to get the conversation about branching structure out into the open. And to do that, I would like to start by asking: _why_ is a single branch not scalable for the product? I have worked in code bases of various sizes, release schedules, and branching schemes, and my experience is that NOT branching is nearly always preferable to branching. In fact, the most common reasons given for wanting to branch (e.g. disruptive or long-running changes) are exactly the times when branching hurts the most, because it encourages divergence between different works-in-progress, pushing difficult integration decisions farther out and requiring more rework later. Rather than a policy that handles change by branching, I would rather see a policy of trunk-based development that encourages all changes - even incomplete changes - to be merged into master on a frequent basis. This shortens and simplifies the merging process and encourages small refactorings and improvements by reducing the amount of work needed to merge them. Obviously allowing unfinished work to be committed to master requires mechanisms in place to ensure that such work does not affect the releasability of the framework. Mechanisms such as feature flags and branch by abstraction are slightly more complicated to implement in a framework (especially one that is highly tuned for performance) than in a typical application, but not excessively so. And the advantage of keeping all of the ongoing work together is well worth the effort. What are the community's thoughts on trunk-based development? 
323 area-System.Text Decrease in working set was incorrectly marked as test failure "I am hitting a test failure that does not have a consistent repro. ``` Err_004888anied Total memory increased significantly there may be a memory leak (TaskId:339) Percent Difference=0.338227637196709, Max Acceptable Percent Difference=0.3, Start Working Set=576180, End Working Set=381300 (TaskId:339) ``` We probably should not use `Math.abs()` to calculate the difference at https://github.com/dotnet/corefx/blob/master/src/System.Text.RegularExpressions/tests/RegexLangElementsCoverageTests.cs#L60 ``` csharp percentDifference = ((double)Math.Abs(startWorkingSet - endWorkingSet)) / startWorkingSet; if (maxAcceptablePercentDifference < percentDifference) { Console.WriteLine(""Err_004888anied Total memory increased significantly there may be a memory leak""); Console.WriteLine(""Percent Difference={0}, Max Acceptable Percent Difference={1}, Start Working Set={2}, End Working Set={3}"", percentDifference, maxAcceptablePercentDifference, startWorkingSet, endWorkingSet); iCountErrors++; } ``` " 325 area-System.Numerics StringBuilder append with single length strings changed to chars Some StringBuilder.Append calls made with single length symbol strings. These can be a char so that they get directly assigned to the internal char array buffer. Also more consistent with other areas of the codebase. 326 area-Meta .NET Core and pattern compatability (@terrajobst rightly [pointed out](https://github.com/dotnet/corefx/issues/271#issuecomment-68022859) that the thread on implementing collection interfaces on Regex classes had wandered into abstract discussion not related to the original issue. So I am opening this issue in order to continue the discussion here.) 
To summarize my point (see the [Regex thread](https://github.com/dotnet/corefx/issues/271) for more details): with the addition of IReadOnlyList in .NET 4.5, I don't believe it is appropriate to continue implementing mutable collection interfaces such as IList<T> on immutable collections. The fact that existing immutable collections (including those in the Immutable Collections library) implement these interfaces is not a good enough reason to continue this flawed pattern. This issue is a particular example of the larger question of how .NET Core will evolve, and how much tolerance it has to evolve _away_ from flawed patterns in the .NET Framework. 329 area-Infrastructure Use `nuget restore` instead of `nuget install` See [this line](https://github.com/dotnet/corefx/pull/324#discussion-diff-22273168R41) and the discussion for details. 338 area-System.Text RegexCharClass Improvements and Cleanup Apologies in advance for the large pull request. Normally, I'd break these changes up into separate smaller pull requests, but in this case the changes were all to the same file and some changes build on top of earlier changes, so it was easier to submit as one larger pull request. Each commit is factored appropriately, so it may be easiest to review each commit separately for more sane diffs. The most significant change was to precompute the constants instead of computing the values at runtime in the static constructor. I have a [one-off test](https://gist.github.com/justinvp/0c1b5faf72349b56a2ed) (not intended to be checked-in) that asserts that the precomputed values from the new implementation are equal to the values that would have been computed at runtime using the previous implementation. 340 area-System.Text Support for RegexOptions.Compiled? `RegexOptions.Compiled` isn't currently supported (the option exists but doesn't do anything). Will this be supported? If so, when? If not, why not? 
342 area-System.Text More Regex Cleanup Sorry for another larger pull request -- there are just so many opportunities for cleanup in Regex! Most of the commits are pretty fine-grained (some probably could have been squashed together, sorry). 344 area-System.Security Support for Rfc2898DeriveBytes and other founding cryptographic support Hi, Reading the contribution DOs and DON'TS it says not to file an API addition without discussing first. I have the System.Security.Cryptography.Rfc2898DeriveBytes essentially created in a PCL so works in WinRT etc. I'd like to expand on this as Rfc2898DeriveBytes is HmacSha1 based and people are trying to move onto hmacsha256 and upwards nowadays and I'd like to bring this into .NET Core I'd like to start bringing in System.Security.Cryptography namespace into .NET Core. I'll include tests from the test data listed with the cryptographic standards and whatnot where possible. Can I do this? :) 348 area-Meta Updated README.md with recently added projects 349 area-System.Threading TaskHelpers cleanup TaskHelpers.cs is a temporary stop-gap for assemblies that don't yet depend on the latest System.Threading.Tasks contract. This change removes the dependency on the file for those assemblies that do use the latest, changing them to access the Task.CompletedTask, Task.FromCanceled, and Task.FromException members on Task instead of the corresponding workarounds on TaskHelpers. This file should eventually go away entirely. 350 area-System.Console Use StringBuilderCache in additional assemblies "StringBuilderCache maintains a cache of up to a single StringBuilder per thread, avoiding StringBuilder allocations for the common pattern of: ``` C# StringBuilder sb = new StringBuilder(); // ... do some appends return sb.ToString(); ``` and instead enabling: ``` C# StringBuilder sb = StringBuilderCache.Acquire(); // ... do some appends return StringBuilderCache.GetStringAndRelease(sb); ``` It was added to the repo as part of System.IO.FileSystem. 
This change deploys usage of it through other assemblies. Not all ""new StringBuilder(...)"" usage has been replaced, just those instances that are expected to be most impactful, e.g. tests were ignored, places where StringBuilders are stored into fields were ignored, debug code was ignored, less common members were ignored, etc. Simple microbenchmarks showed measurable improvements as a result. For example, a test around Regex.Replace showed that ~10% of the allocations incurred were for StringBuilder instances; using StringBuilderCache mostly eliminated these and improved throughput by a corresponding ~10%. This will of course vary based on the test employed, but it's a data point. " 352 area-Meta Design document for performance patterns As noted in #351, some patterns which are commonly used in the .NET Framework for performance reasons are not obvious to users who are new to this project. Due to the number of projects which leverage the .NET library as part of their own applications, performance is a particularly sensitive area, especially in regards to [unnecessary] memory allocations. It would be very helpful to have a document explaining commonly used patterns intended to avoid unnecessary memory allocations, and perhaps other techniques which your prior work has indicated are high-value practices for some reason. 361 area-System.Net Ability to set the HttpWebRequest.DefaultMaximumErrorResponseLength on a per request basis I realize the http code hasn't been imported yet.. But it would be really nice if this could be changed in >=4.6 Currently the only way you can set the DefaultMaximumErrorResponseLength for a request is AppDomain wide. We should also be able to be set on a per request basis (like we can do with ServicePoint). 362 area-System.Console Bring back Console.CancelKeyPress We need this so that we can wait on Ctrl + C events in console applications. 
issue #311 371 area-System.IO Unix Interop Code is platform specific In commits like this: https://github.com/dotnet/corefx/commit/377dde76c5df105a7bab2c0a3002c0d835ff1b07 An attempt has been made to map both structures and constants to C# managed structures. Unix does not have a binary interface for either the constants or the structures, so the definition is likely already wrong for some platforms and will likely be wrong for ports, even within Linux itself. While the structures are constants do change from Unix to Unix a little known quirk is that both structures and constants also change in size, layout and the values in Linux itself. Early versions of Linux chose to emulate the ABI of the host platform, allowing early Linux to easily execute binaries written for the mainstream operating system of the platform. Linux/Alpha copied OSF/1; Linux/SPARC copied SunOS 4; Linux/MIPS some variation of IRIX and so on. This is why Linux includes many of these definitions in /usr/include/arch/XXX, because they are different in each platform. You should do something similar to what Mono.Posix does, which is to have a managed wrapper that talks to a C library that actually performs the native call and marshals the native version to a well-known structure shared with the managed implementation. 377 area-System.Text Regex tests intermittently fail due to memory usage "In RegexLangElementsCoverageTests.cs, there are several checks related to the total memory usage of the test: ``` CSharp // ** RegexLangElementsCoverageTests.cs ** double maxAcceptablePercentDifference = .3; startWorkingSet = GC.GetTotalMemory(true); ... 
endWorkingSet = GC.GetTotalMemory(true); percentDifference = ((double)Math.Abs(startWorkingSet - endWorkingSet)) / startWorkingSet; if (maxAcceptablePercentDifference < percentDifference) { Console.WriteLine(""Err_004888anied Total memory increased significantly there may be a memory leak""); Console.WriteLine(""Percent Difference={0}, Max Acceptable Percent Difference={1}, Start Working Set={2}, End Working Set={3}"", percentDifference, maxAcceptablePercentDifference, startWorkingSet, endWorkingSet); iCountErrors++; } ``` I think this check may be causing us more bad than good. I'm not sure what GC.GetTotalMemory takes into consideration, but this logic might also be getting tripped up by the fact that we have multiple tests running in parallel. Also, in general, this seems like a bit of a flaky check in the first place; it fails around 10% of the time for me, I would say. Anyone opposed to removing these checks, or at least making it not an error? " 381 area-Meta Question: Single statement `if` formatting "Looking at `System.Threading.Tasks.Dataflow` (amazing library by the way), I see three different variations of formatting of single-statement `if` statements. In _one_ method. 
From [`DataflowBlock.OutputAvailableAsync`](https://github.com/dotnet/corefx/blob/2bf5a7185fd81949da4d47a432662d8fe80989f6/src/System.Threading.Tasks.Dataflow/src/Base/DataflowBlock.cs#L1485-L1550) [**One line**](https://github.com/dotnet/corefx/blob/2bf5a7185fd81949da4d47a432662d8fe80989f6/src/System.Threading.Tasks.Dataflow/src/Base/DataflowBlock.cs#L1489): ``` c# if (source == null) throw new ArgumentNullException(""source""); ``` [**Two lines**](https://github.com/dotnet/corefx/blob/2bf5a7185fd81949da4d47a432662d8fe80989f6/src/System.Threading.Tasks.Dataflow/src/Base/DataflowBlock.cs#L1493-L1494): ``` c# if (cancellationToken.IsCancellationRequested) return Common.CreateTaskFromCancellation<bool>(cancellationToken); ``` [**With braces**](https://github.com/dotnet/corefx/blob/2bf5a7185fd81949da4d47a432662d8fe80989f6/src/System.Threading.Tasks.Dataflow/src/Base/DataflowBlock.cs#L1513-L1516): ``` c# if (target.Task.IsCompleted) { return target.Task; } ``` Are the guidelines for formatting these more nuanced than I'm able to see, or is this just inconsistent formatting? Is this kind of inconsistent formatting normal and acceptable? Is there a preference for how new code should be formatted? " 384 area-System.Collections Application of ObsoleteAttribute to legacy collection types As mentioned in #370, the lack of `[Obsolete]` on certain legacy types in `System.Collections.Specialized` could be confusing to developers. In general I support the use of ObsoleteAttribute, but for this particular library I would recommend it only be applied to types which have well-known newer (better) alternatives. I'm working on the following table... | Legacy Type | Replacement | | --- | --- | | `BitVector32` | No direct replacement? | | `HybridDictionary` | No direct replacement? | | `IOrderedDictionary` | No direct replacement? | | `ListDictionary` | No direct replacement? | | `NameObjectCollectionBase` | No direct replacement? | | `NameValueCollection` | No direct replacement? 
| | `OrderedDictionary` | No direct replacement? | | `StringCollection` | `List<string>` | | `StringEnumerator` | Only used with `StringCollection` (not marked obsolete?) | | `StringDictionary` | `Dictionary<string, string>` |
One dataset is used for training, and the other is used for testing the model's accuracy after training. We can train the model by launching the playground and executing the following command in the Linux terminal that opens once we click the “Run” button:
mlnet classification \
  --dataset "TrainingData/issues_train.tsv" \
  --test-dataset "TrainingData/issues_test.tsv" \
  --has-header true \
  --label-col 1 \
  --name MulticlassModelDemo \
  --ignore-cols 0 \
  --train-time 30
In this command, we apply the --ignore-cols parameter to the first column (index 0), as we know that this column holds a unique identifier that carries no meaningful information for training.
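To see what the column indexes in the command refer to, it can help to inspect the training file before running it. The following is only a rough sketch: the function names label_counts and drop_id are hypothetical helpers, not part of the mlnet tool, and the sketch assumes every record sits on a single line (descriptions that span several lines inside quotes would need a real TSV parser).

```shell
# label_counts FILE
# Hypothetical helper: counts how many records carry each label in the
# second column (index 1, the Area field), skipping the header row.
# Output is "count<TAB>label", most frequent label first.
# Assumption: one record per line.
label_counts() {
  awk -F '\t' '
    NR > 1 { counts[$2]++ }
    END    { for (label in counts) print counts[label] "\t" label }
  ' "$1" | sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# drop_id FILE
# Hypothetical helper: prints the file without its first column (index 0,
# the ID field), which is the column the command tells the trainer to ignore.
drop_id() {
  cut -f2- "$1"
}
```

In the playground terminal, these could be called as label_counts "TrainingData/issues_train.tsv" to check how evenly the areas are represented, and drop_id "TrainingData/issues_train.tsv" | head -n 3 to preview the columns that remain once the ID is left out.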
Note: Due to the size limit of the playground, only a small fraction of the original dataset is present, which will significantly affect the accuracy of the trained model. Therefore, the accuracy scores that will be shown while ...